<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
  <title>jack marchant</title>
  <link>https://www.jackmarchant.com/</link>
  <description>writing about software</description>
  <language>en</language>
  <atom:link href="https://www.jackmarchant.com/rss.xml" rel="self" type="application/rss+xml" />
<item>
  <title>Agentic coding habits to accelerate your engineering workflow</title>
  <link>https://www.jackmarchant.com/agentic-coding-habits-to-accelerate-your-engineering-workflow</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/agentic-coding-habits-to-accelerate-your-engineering-workflow</guid>
  <pubDate>Sun, 19 Apr 2026 09:00:00 +0000</pubDate>
  <description>Coding agents aren't just speeding up development, they're changing what it means to be a senior engineer.</description>
  <content:encoded><![CDATA[<p>Coding agents aren't just speeding up development, they're changing what it means to be a senior engineer.</p>
<p>I've been coding with agents for a while, watching the tools evolve from rough experiments to something I now rely on daily. Last year I was hand-rolling workflows with Cursor. Now I'm all in on Claude, for better or worse.</p>
<p>It's an unusual time to be a software engineer. I'm also very aware of where this sits in my career.</p>
<p>As you become more senior, your role shifts from writing code to improving how others write it. That comes through teaching, guiding system design, and making architectural decisions shaped by experience.</p>
<p>A junior engineer can now generate production-level code without fully understanding it. That's new.</p>
<p>The shift isn't about writing code faster. It's about designing the system that writes the code.</p>
<p>So these are some habits I've picked up that have served me well so far.</p>
<h2>1. Make it work, tune it, automate it</h2>
<p>This applies broadly to engineering, but it's especially relevant when working with coding agents.</p>
<p>When building a new skill or workflow, it's easy to get stuck trying to design the perfect solution upfront. Instead:</p>
<ul>
<li><strong>Make it work:</strong> Get a minimal set of steps that produces most of the desired outcome.</li>
<li><strong>Tune it:</strong> Improve reliability, remove manual steps, and refine edge cases.</li>
<li><strong>Automate it:</strong> Lock it in so it runs consistently without intervention.</li>
</ul>
<p>Perfection is the end state, not the starting point.</p>
<h2>2. Review intent, structure, and complexity, not line by line</h2>
<p>Once teams adopt agentic workflows, pull request volume tends to increase. That raises the question: why are we still reviewing code line by line?</p>
<p>In reality, full line-by-line comprehension was always rare.</p>
<p>Effective reviews focus on:</p>
<ul>
<li><strong>Intent:</strong> what problem is this solving?</li>
<li><strong>Structure:</strong> how is the system organised?</li>
<li><strong>Complexity:</strong> where are the risks and failure modes?</li>
</ul>
<p>Senior engineers already work this way. They map systems mentally, question assumptions, and evaluate how code evolves over time.</p>
<p>That doesn't change with agent-generated code. If anything, it becomes more important.</p>
<p>When speed matters, reviewing together with the author can dramatically accelerate understanding.</p>
<h2>3. 80/20 rule for planning</h2>
<p>I rarely plan to the nth degree when working with agents.</p>
<p>Instead, I aim for a plan that captures roughly 80% of the value, and deliberately leave 20% undefined.</p>
<p>The key is knowing what needs precision and what can be discovered through iteration.</p>
<p>Plans are cheap. Execution reveals truth.</p>
<p>Rather than endlessly refining a plan, it's often faster to build something concrete, identify weaknesses, and iterate.</p>
<p>There have been times I've built multiple versions of the same solution, extracted the learnings, and used that to inform a better approach.</p>
<h2>4. Trust in determinism, commit to scripts</h2>
<p>Watching an agent generate scripts on the fly sounds impressive. Most of the time, it's doing something I don't actually want.</p>
<p>It also introduces unnecessary variability.</p>
<p>When a workflow requires data processing or repeatable steps, I generate a script once, review it carefully, and reuse it.</p>
<p>Determinism matters.</p>
<p>A known, verified script run 100 times is far more valuable than something &quot;clever&quot; generated differently each time.</p>
<h2>5. Outsource what is easy, invest in what is hard</h2>
<p>To focus on high-impact work, I constantly look for what can be offloaded.</p>
<p>This usually includes:</p>
<ul>
<li>Code generation</li>
<li>Git workflows</li>
<li>Managing stacked pull requests</li>
<li>Prioritising review feedback</li>
</ul>
<p>These tasks still require oversight, but not constant attention.</p>
<p>By identifying repeatable parts of my workflow and delegating them to agents, I create more space for solving harder problems.</p>
<p>My work has shifted from writing and designing code to planning and verifying it.</p>
<p>A simple state machine is often enough to formalise and automate these workflows over time.</p>
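<p>As a sketch of what that could look like (the states and transitions here are hypothetical, not a prescription):</p>

```python
from enum import Enum, auto

class TaskState(Enum):
    QUEUED = auto()           # spec written, waiting for an agent
    GENERATING = auto()       # agent is producing code
    AWAITING_REVIEW = auto()  # output ready for human review
    DONE = auto()             # approved and merged

# Legal transitions for a delegated task; anything else is a bug.
TRANSITIONS = {
    TaskState.QUEUED: {TaskState.GENERATING},
    TaskState.GENERATING: {TaskState.AWAITING_REVIEW},
    TaskState.AWAITING_REVIEW: {TaskState.GENERATING, TaskState.DONE},
    TaskState.DONE: set(),
}

def advance(current: TaskState, nxt: TaskState) -> TaskState:
    """Move a task to its next state, rejecting illegal transitions."""
    if nxt not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {nxt.name}")
    return nxt
```

<p>Review feedback sends a task from <code>AWAITING_REVIEW</code> back to <code>GENERATING</code>; everything else moves forward. That's the whole machine.</p>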
<p>If a task is repeatable, it's a candidate for removal.</p>
<h2>Conclusion</h2>
<p>The biggest shift isn't that agents can write code. It's that they force us to think in systems, not implementations.</p>
<p>Senior engineers have always been moving in this direction. Agents just accelerate it.</p>
<p>The real question isn't how good the tools get, but how quickly we adapt to working through them.</p>]]></content:encoded>
</item>
<item>
  <title>Will AI replace software engineers?</title>
  <link>https://www.jackmarchant.com/will-ai-replace-software-engineers</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/will-ai-replace-software-engineers</guid>
  <pubDate>Wed, 08 Apr 2026 09:00:00 +0000</pubDate>
  <description>The question everyone’s asking</description>
  <content:encoded><![CDATA[<h2>The question everyone’s asking</h2>
<p>It’s the topic on everyone’s mind, especially software engineers. AI is coming for people’s jobs. I don’t buy it.</p>
<h2>The radiology parallel</h2>
<p>I recently listened to Jensen Huang on the Lex Fridman Podcast, where he described the change radiology went through with the introduction of AI. Rather than replacing radiologists, AI actually increased demand for them, because scans could now be processed much faster. AI can do the work of reading the scans, but the radiologist’s work shifts to fill the gaps where a human is required.</p>
<div class="tweet-card"><svg class="tweet-card__icon" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512" aria-hidden="true"><path d="M389.2 48h70.6L305.6 224.2 487 464H345L233.7 318.6 106.5 464H35.8L200.7 275.5 26.8 48H172.4L272.9 180.9 389.2 48zM364.4 421.8h39.1L151.1 88h-42L364.4 421.8z"/></svg><a class="tweet-card__link" href="https://x.com/lexfridman/status/2035833853357830640" target="_blank" rel="noopener noreferrer">View on X</a></div>
<p>This particular story has stuck with me and I can draw a few parallels to the work software engineers do today, which goes well beyond the work of writing code.</p>
<h2>What software engineering actually is</h2>
<p>Software engineering is fundamentally about making tradeoffs, given a set of constraints. Over the length of a career, an engineer builds up a context window much larger than any model on the market today: making difficult decisions, using intuition, and learning from the effects of mistakes along the way. It’s with this knowledge that an engineer can apply reasoning to any situation, sometimes with emotion influencing the outcome.</p>
<h2>Speed isn’t the metric</h2>
<p>AI, at least the passing of context and source material through an LLM, is making many parts of the software engineer’s role faster. But speed is not a useful metric on its own. The effectiveness of AI in software engineering should be measured as a combination of speed, direction, and quality of outcome.</p>
<p>In your own work, throughput isn’t enough. The same quality of work you’ve always done is not enough. AI is a multiplier across many facets of engineering, but one thing I think is being downplayed at the moment is what AI can produce in the hands of a skilled professional engineer.</p>
<h2>The baking analogy</h2>
<p>That tension — between automation and craft — isn’t unique to software. As I think about other industries and how automation has played a part, baking comes to mind: in the beginning, bakers moulded dough by hand, and over time the craft evolved. By the 20th century, machines could produce the same bread with higher consistency. Which one is better? That depends on whom you ask. Some prefer the hand-crafted sourdough experience, while others need the consistency and stability of mass-produced bread.</p>
<p>Software may become something similar, where the hand crafted artisan engineering is valued but in much lower quantities than before. Reserved for those who need to fine tune and finesse the output, refining algorithm implementations in much the same way we see lower level languages written today. For those who don’t need to critique or analyse the code, AI will automate this step entirely, such that the fact there is code written will be a forgotten footnote.</p>
<h2>Where does that leave engineers?</h2>
<p>Software engineering will continue to be about judgement and choosing the right tool for the job, which now includes coding agents that can be layered in or given as much control as possible. This isn’t a one-size-fits-all solution. A human engineer has the ability to decide where AI fits into the software engineering workflow.</p>
<h2>The better question</h2>
<p>“Will AI replace engineers?” is the wrong question to ask. Instead, ask where in the process of engineering AI is going to replace manual work.</p>
<p>The question isn’t whether AI replaces engineers. It’s which parts of engineering get automated first — and whether you’re the one deciding where AI fits, or waiting for someone else to decide for you.</p>]]></content:encoded>
</item>
<item>
  <title>engineering the loop</title>
  <link>https://www.jackmarchant.com/engineering-the-loop</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/engineering-the-loop</guid>
  <pubDate>Mon, 16 Mar 2026 09:00:00 +0000</pubDate>
  <description>I've been spending more time thinking about how I work with agents than the actual code they produce. That shift in focus, from output to process, is what engineering the loop is about.</description>
  <content:encoded><![CDATA[<p>I've been spending more time thinking about how I work with agents than the actual code they produce. That shift in focus, from output to process, is what engineering the loop is about.</p>
<p>The loop is the automation pipeline that sits between me and a coding agent. Done well, it lets me hand off well-defined pieces of work to Claude Code and stay focused on the bigger picture: system design, architecture decisions, the things that actually require judgment.</p>
<h2>What the loop looks like</h2>
<p>The loop is a long-running script that sits in the background. It isn't triggered manually each time — it's always on, watching for work to pick up and responding to state changes across my projects.</p>
<p>My current setup is fairly simple. I write specs into Jira tickets or pull context together in Confluence pages. Claude Code picks those up as a starting point. The goal is that by the time I'm handing something off, the work is already decomposed enough that the agent isn't guessing at intent.</p>
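<p>Stripped to its core, the loop is just a poll-and-dispatch cycle. A minimal sketch, with the tracker query and the agent handoff left as injected functions, since those details depend entirely on your tooling:</p>

```python
import time

def run_loop(fetch_ready, hand_off, poll_seconds=60, max_cycles=None):
    """Poll for work that's ready for an agent and hand it off.

    fetch_ready: returns tickets whose specs are ready to delegate.
    hand_off:    starts an agent session for a single ticket.
    max_cycles:  cap on iterations (None means run forever).
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        for ticket in fetch_ready():
            hand_off(ticket)
        cycles += 1
        if max_cycles is None or cycles < max_cycles:
            time.sleep(poll_seconds)
    return cycles
```

<p>Everything interesting lives in <code>fetch_ready</code>: it's where "ready for an agent" gets defined, which is really a statement about how good the spec is.</p>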
<p>The quality of input matters far more than I expected. A vague ticket produces vague code. A well-structured spec with clear acceptance criteria produces something reviewable on the first pass.</p>
<p>Writing a good spec is also a forcing function. If I can't describe the work clearly enough for an agent to act on, I probably haven't thought it through clearly enough to build it myself either.</p>
<h2>What's changed</h2>
<p>Two things have shifted since I started taking this seriously.</p>
<p>The first is focus. When smaller, mechanical tasks have a clear home (a defined input, a defined outcome, an agent to run them), they stop taking up mental space. I'm spending more time on architecture and less time holding implementation details I'd rather not be thinking about.</p>
<p>The second is throughput on the boring parts. Not glamorous, but real. The SDLC has a long tail of tasks that are well-understood but time-consuming: boilerplate, tests, doc updates. These are good candidates for the loop. The human decision has already been made, execution is what remains.</p>
<h2>Where it still breaks down</h2>
<p>Review and QA is the current bottleneck. The loop produces code faster than I can meaningfully review it, which creates its own kind of pressure. Speed without oversight isn't actually a win.</p>
<p>I've started giving the loop context about branches that are currently in review. It can scan open PRs, read the comments, and make a first pass at deciding which ones represent unresolved feedback that needs action versus ones that are already addressed or are just discussion. That triage is useful — it means I'm not manually combing through threads — but it still surfaces a list I have to work through. The loop can flag; the judgment call on what to actually do sits with me.</p>
<p>The other rough edge is knowing what not to hand off. Some tasks look automatable on the surface but require judgment at each step, the kind of decision-making that's hard to encode in a spec. I'm still calibrating where that line is.</p>
<h2>Keeping the human in the loop</h2>
<p>Automation can create the illusion that the human is optional. It isn't.</p>
<p>The loop works best when it's handling execution, not making decisions. But the line between the two isn't always obvious, and it's easy to drift into a pattern where the agent is being asked to decide things that should still sit with you.</p>
<p>A few places where I've noticed this creep in: choosing between competing implementation approaches, deciding what's in scope for a ticket, and judging whether a failing test is a real problem or a test that needs updating. These all look like execution tasks on the surface. They're not. They're judgment calls, and if you're not careful, you end up with an agent making them silently.</p>
<p>The corrective I've landed on is building the approval step directly into the ticket flow rather than managing an agent session in real time. When I assign work to an agent, the first thing it does is post a plan as a comment on the Jira ticket. I can review that plan on my own time, in the same place the spec lives. If something's off, I leave a comment and the agent produces a revised plan. If it looks right, I approve it and implementation starts.</p>
<p>That loop — plan, review, revise or approve — keeps me in the decision path without requiring me to babysit a session. I'm not watching the agent work. I'm reviewing its intent before it acts, which is the right place to catch problems.</p>
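<p>That cycle can be sketched as a small control loop. The three callbacks stand in for the Jira comment plumbing, which will vary by setup:</p>

```python
def plan_approval_flow(propose_plan, get_review, implement, max_revisions=3):
    """Keep the human in the decision path: no code until a plan is approved.

    propose_plan(feedback) -> plan text (feedback is None on the first pass)
    get_review(plan)       -> "approve", or a comment asking for changes
    implement(plan)        -> executes the approved plan
    """
    feedback = None
    for _ in range(max_revisions + 1):
        plan = propose_plan(feedback)
        verdict = get_review(plan)
        if verdict == "approve":
            return implement(plan)
        feedback = verdict  # reviewer comment feeds the next revision
    raise RuntimeError("plan not approved within the revision limit")
```

<p>The important property is that <code>implement</code> is unreachable until a human has signed off on a plan, and that sign-off happens asynchronously, on the ticket, not in a live session.</p>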
<p>Keeping a human in the loop isn't a constraint on the process. It's what makes the output trustworthy.</p>
<h2>What to take from this</h2>
<p>If you're building your own loop, a few things worth starting with:</p>
<p>Invest in the spec, not the prompt. A well-written ticket with context, constraints, and a clear definition of done is more valuable than a clever one-liner. The agent will only be as good as the brief.</p>
<p>Identify your handoff criteria. Before running anything through the loop, ask: is the decision already made, or is the agent being asked to make it? Execution is a good handoff. Decision-making usually isn't.</p>
<p>Treat review as part of the loop, not outside it. Build in the time. An unreviewed PR is just deferred work.</p>
<p>The loop isn't finished. It's a system under construction. But even in its current rough state, it's changed how I spend my day, and that's enough to keep iterating on it.</p>]]></content:encoded>
</item>
<item>
  <title>Using coding agents in my workflow in 2025</title>
  <link>https://www.jackmarchant.com/using-coding-agents-in-my-workflow-in-2025</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/using-coding-agents-in-my-workflow-in-2025</guid>
  <pubDate>Fri, 05 Dec 2025 09:00:00 +0000</pubDate>
  <description>There's a lot going on in the world of AI right now, so in an effort to cut through the noise, I thought it would be interesting to write down my current practices with AI — both as a time capsule to refer back to next year, and as a challenge to rethink anything I assumed wasn't previously possible without AI. The assumption being that a lot will change a year from now.</description>
  <content:encoded><![CDATA[<p>There's a lot going on in the world of AI right now, so in an effort to cut through the noise, I thought it would be interesting to write down my current practices with AI — both as a time capsule to refer back to next year, and as a challenge to rethink anything I assumed wasn't previously possible without AI. The assumption being that a lot will change a year from now.</p>
<h2>My Everyday Tools &amp; Workflow</h2>
<p>I'm using Cursor a lot. It's become my main editor after having used VS Code for a long time. Like everyone else, I started by giving it some test cases to write and letting autocomplete finish code — which felt superhuman at the time.</p>
<p>Now, I'm running research and planning tasks before building, or if the task is small enough, one-shotting a prompt to take a trivial task off my mental plate.</p>
<p>I'm also experimenting with coding agents to perform more meaningful tasks and accelerate development at a speed that feels like cheating. The good thing is, a lot of the time it's for tasks I've already thought a lot about but haven't had the time to write by hand.</p>
<h2>Why Agents Work</h2>
<p>Although using agents still feels like a stretch sometimes, it's really just another layer of abstraction to prevent me from using valuable brain compute time on syntax, code structure or design. With enough experience, you know what “good” looks like — clear separation of concerns, implementing interfaces, and single responsibility principles, to name a few.</p>
<p>Clean code can be produced by AI with minimal context (e.g. “use these files as reference”).</p>
<p>In a large codebase where the existing code is not strictly &quot;clean&quot;, an agent left to its own devices will tend to produce more of the same. The fix is simple: tell the agent explicitly to write code that conforms to the existing structure and conventions.</p>
<p>Building your own agent, apart from being the cool new thing to do, allows you to customise to your preferred workflow and optimise context on your projects without exposing details to the cloud or sharing between teams. I still think there's a place for personal agents alongside team agents, each with their own strengths. Personal agents might be concerned with the individual developer workflow, where team agents help run checks and balances against diffs before they can be merged, with more context about the wider system.</p>
<h2>My Current Workflow for Non‑Trivial Tasks</h2>
<ol>
<li>Perform research and planning to produce a specifications document.</li>
<li>Iterate intensely on the specs — in some ways this can be treated like code (versioned and improved over time).</li>
<li>Build incrementally with an agent, allowing smaller context windows and clear completion points.</li>
<li>Review the LLM's “thought process” for producing the code.</li>
<li>Create documentation for how the feature works.</li>
</ol>
<p>At each step there's a review of the agent's decisions and approach — tweaking and redirecting where needed. Some tasks require more experimentation than others, but because the output is cheap to produce, it's easy to throw away and start from scratch.</p>
<h2>A Changing Development Lifecycle</h2>
<p>There's still a lot we're all learning and experimenting with. I expect the development lifecycle to be very different by the end of 2026.</p>
<p>A recent live stream talk was particularly eye‑opening. It connected a few dots for me:</p>
<h3>Two Big Takeaways</h3>
<ol>
<li><strong>Minimal context windows</strong> — allows agents to focus on specific tasks and not get distracted. There's a strong parallel to humans here — we talk about context switching and information overload. It turns out agents suffer the same.</li>
<li><strong>Controlling agent process with something as simple as a while loop</strong> — when something so fundamental is taken to a basic level, it opens up possibilities I hadn't considered before.</li>
</ol>
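<p>To make the second point concrete: driving an agent really can be this simple. A sketch, with the agent call abstracted behind <code>run_step</code> (a placeholder, not a real API):</p>

```python
def drive_agent(run_step, is_done, max_steps=50):
    """Invoke an agent step repeatedly until a completion check passes
    or a step budget runs out. The whole controller is one while loop."""
    steps = 0
    output = None
    while not is_done(output) and steps < max_steps:
        output = run_step(output)  # feed the last output back as context
        steps += 1
    return output, steps
```

<p>The completion check and the step budget are where your judgement lives; the loop itself is trivial, which is exactly the point.</p>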
<h2>It's Not Just About Productivity</h2>
<p>It's not just productivity for its own sake. AI is changing what we spend our time doing as engineers.</p>
<p>We're moving away from writing perfect code from scratch. Now, with agents, it's also specs — which can produce intent for the code very cheaply.</p>
<p>Anyone who has debugged legacy code knows half the battle is understanding not just what the code does, but why the engineer chose that approach in the first place. This is huge for future systems we maintain — where codebases could be rewritten at intense pace because the original plans already exist, and can be updated with minimal effort to produce a wildly different version of the system.</p>
<h2>Some Interesting Media</h2>
<ul>
<li><a href="https://ghuntley.com/cursed/">I ran Claude in a loop for three months, and it created a Gen‑Z programming language called “cursed”</a></li>
<li><a href="https://ghuntley.com/ralph/">Ralph Wiggum as a “software engineer”</a></li>
<li><a href="https://www.youtube.com/live/fOPvAPdqgPo?si=oUltz4JUIPVOlLAU">Youtube live stream talk</a></li>
<li><a href="https://open.spotify.com/episode/3XTdfnHeQlROSmlAHNqEUs?si=c1537e00ddb944e0">How AI is changing software engineering at Shopify with Farhan Thawar</a></li>
<li><a href="https://www.youtube.com/watch?v=cf-WOKVn768">AWS re:Invent 2025 - How Amazon Teams Use AI Assistants to Accelerate Development (DEV403)</a></li>
</ul>]]></content:encoded>
</item>
<item>
  <title>Beyond Autocomplete: A practical guide to AI-Assisted Development</title>
  <link>https://www.jackmarchant.com/beyond-autocomplete-a-practical-guide-to-ai-assisted-development</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/beyond-autocomplete-a-practical-guide-to-ai-assisted-development</guid>
  <pubDate>Fri, 19 Sep 2025 09:00:00 +0000</pubDate>
  <description>To truly leverage the power of AI in software engineering, we need to move beyond simple code completion. In this post, I want to explore some practical techniques for using AI as a development partner, from pair programming to bug hunting, and even look at what the future might hold for AI in other parts of software development.</description>
  <content:encoded><![CDATA[<p>To truly leverage the power of AI in software engineering, we need to move beyond simple code completion. In this post, I want to explore some practical techniques for using AI as a development partner, from pair programming to bug hunting, and even look at what the future might hold for AI in other parts of software development.</p>
<h2>The New Pair Programmer: Your AI Sounding Board</h2>
<p>We've all been there: stuck on a problem, talking it through with a rubber duck on our desk. The act of articulating the problem often illuminates the solution. AI assistants have become the ultimate interactive rubber duck. Instead of just stating the problem, you can have a dialogue.</p>
<p>By asking the AI questions like, &quot;Can you explain this block of code to me in simple terms?&quot; or &quot;What are the potential edge cases for this function?&quot;, you are forced to structure your own thoughts. The AI's response, whether it's perfectly correct or slightly off, provides a new perspective that can break you out of a mental block. This interactive process is a powerful evolution of the classic debugging technique, turning a monologue into a productive conversation.</p>
<h2>AI-Driven Test Generation: A Second Opinion on Functionality</h2>
<p>Writing comprehensive tests is crucial, but it can be tedious. This is an area where AI can shine. Instead of just writing tests yourself, you can present a function to an AI model and ask it to generate a suite of test cases.</p>
<p>This approach offers two key benefits. First, it accelerates the process of writing boilerplate test code. Second, and more importantly, the AI might interpret the function's purpose differently than you intended. The tests it generates can reveal ambiguities in your code or highlight edge cases you hadn't considered. It acts as an impartial reviewer, testing what the code actually seems to do, not just what you intended it to do.</p>
<p>This is typically where most engineers start experimenting with AI, beyond simple code completion and moving towards code generation.</p>
<h2>Accelerating Feature Development with Contextual Prompts</h2>
<p>One of the most effective ways to use AI is for brownfield projects where established patterns already exist. Instead of a generic prompt like &quot;write a function to fetch user data,&quot; you can provide the AI with specific context.</p>
<p>For example, you could provide an existing API endpoint function and prompt it with: &quot;Following this example, create a new endpoint to handle product data, including validation for the 'price' and 'stock' fields.&quot; By giving the AI a clear template, you guide it to produce code that aligns with your project's existing structure, conventions, and style. This makes adding new features faster and helps maintain a consistent codebase.</p>
<h2>Increasing Confidence by Detecting Bugs (or Proving the AI Wrong)</h2>
<p>AI can be an excellent &quot;second pair of eyes&quot; for catching subtle bugs. You can paste a piece of code and ask the model to review it for potential issues, race conditions, or security vulnerabilities. It's surprisingly effective at spotting common mistakes.</p>
<p>Interestingly, even when the AI is wrong, it provides value. <strong>If the model flags a piece of code as buggy and you investigate and prove it's correct, you've just engaged in a deep-dive review of your own logic.</strong> This process of validating your code against the AI's critique significantly increases your confidence in its correctness.</p>
<h2>What's Next? AI in System Design</h2>
<p>What's next for AI? Can models gain enough context and direction to assist with architecture and system design? This is the next frontier, but it comes with significant challenges. A good system design is all about making trade-offs between constraints such as:</p>
<ul>
<li>cost vs. performance</li>
<li>consistency vs. availability</li>
<li>scalability vs. complexity</li>
</ul>
<p>For an AI to make meaningful contributions here, it needs a vast amount of context. It would need to understand business goals, budget constraints, team skill sets, and existing infrastructure. Simply asking it to &quot;design a scalable microservices architecture&quot; will likely result in a textbook answer boiling the ocean with &quot;best practices&quot; that aren't practical for your specific situation.</p>
<p>The future of AI in system design will likely involve a highly interactive process, where architects use AI to explore different design patterns, model performance trade-offs, and generate diagrams, but the final strategic decisions will still rest on human experience and deep contextual understanding.</p>
<h2>My Take on the Current State of AI for software engineering</h2>
<p>In my view, the real power of AI in its current form is not as an autonomous coder, but as a thought partner. Using an LLM as a sounding board or a way to fact-check your own assumptions gives a necessary structure to the development process. It forces you to articulate your problem clearly and highlights potential issues much faster, allowing you to focus your attention on double-checking the most critical parts.</p>
<p>This is why I use these tools both inside and outside of the code editor. Limiting myself to simple autocompletion is like using a smartphone only for making calls. The true value comes from a more holistic integration: brainstorming solutions, drafting documentation, and debugging complex logic in a conversational interface. The AI isn't replacing the engineer; it's augmenting the development workflow and becoming an indispensable part of an engineer's toolkit.</p>]]></content:encoded>
</item>
<item>
  <title>How does a relational database index really work?</title>
  <link>https://www.jackmarchant.com/how-does-a-relational-database-index-really-work</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/how-does-a-relational-database-index-really-work</guid>
  <pubDate>Thu, 29 Feb 2024 06:22:00 +0000</pubDate>
  <description>A common question in software engineering interviews is how can you speed up a slow query? In this post I want to explain one answer to this question, which is: to add an index to the table the query is performed on.</description>
  <content:encoded><![CDATA[<p>A common question in software engineering interviews is <em>how can you speed up a slow query?</em> In this post I want to explain one answer to this question, which is: to add an index to the table the query is performed on.</p>
<h2>What is an index in a relational database?</h2>
<p>An index in a relational database is a key-value mapping for one or more columns, where the key is the data in the column and the value is the primary ID of the row that contains it.
Every database table also has a primary index, so querying by ID is always fast. A custom index is a reverse lookup into that primary index.</p>
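<p>The mapping can be pictured in a few lines of Python. This is only an illustration of the reverse lookup, not how a real database stores an index (typically a B-tree):</p>

```python
# Rows keyed by primary ID, as the table stores them.
rows = {
    1: {"name": "Ada", "status": 1},
    2: {"name": "Sam", "status": 0},
    3: {"name": "Mona", "status": 1},
}

# An index on `status`: column value -> primary IDs of matching rows.
status_index = {}
for pk, row in rows.items():
    status_index.setdefault(row["status"], []).append(pk)

# A lookup via the index jumps straight to the matching rows
# instead of scanning the whole table.
active = [rows[pk] for pk in status_index.get(1, [])]
```

<p>Finding <code>status = 1</code> becomes a single dictionary lookup rather than a scan of every row, which is the trade an index makes: extra storage and write cost in exchange for fast reads.</p>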
<h2>How does an index speed up database queries?</h2>
<p>An index tells the database which rows contain specific values, without having to scan each row individually.</p>
<p>A common way to understand it is the index of a phone book.
If I was trying to find someone with the last name &quot;Martin&quot; in a phone book, I would skip to the back pages to the index, find names starting with M and start looking from the referenced page number.</p>
<p>A database does the same lookup with an index.</p>
<p>Let's take a look at a more concrete example. Suppose we create a new table:</p>
<pre><code class="language-sql">CREATE TABLE `users` (
  `id`          bigint          NOT NULL AUTO_INCREMENT PRIMARY KEY,
  `name`        varchar(255)    NOT NULL,
  `status`      int             NOT NULL,
  `joined_on`   datetime        NOT NULL
);</code></pre>
<p>A query to find the users where status is <code>1</code> would result in a full table scan.</p>
<pre><code class="language-sql">explain select * from users where status = 1;
&gt; ... | Extra
        Using where</code></pre>
<p>If we add an index to the status column because we know it's a common access pattern for our application:</p>
<pre><code class="language-sql">ALTER TABLE users ADD INDEX status(status);</code></pre>
<p>When we run the explain again, we can see it uses the new index:</p>
<pre><code class="language-sql">explain select * from users where status = 1;
&gt; ... | key     | .. | Extra
        status  | .. | Using index</code></pre>
<p>When the database performs the operations for this query it will use the index instead of scanning every row, which starts making a big difference when there are millions of rows to scan.</p>
<h2>Handling complex queries with a composite index</h2>
<p>Continuing with this example, let's assume we have another access pattern which is to find all the users with a specific status who joined after a certain date ordered from most to least recent.</p>
<pre><code class="language-sql">explain select * from users where status = 1 and joined_on &gt;= '2024-02-24' order by joined_on desc;
&gt; ... | key     | .. | Extra
        status  | .. | Using where; Using filesort</code></pre>
<p>Without an index on the <code>joined_on</code> column, the query can still benefit from the index we added on <code>status</code>. Performance may still suffer, however: the <code>joined_on</code> filter and the sort result in a filesort operation, which can make overall performance worse.</p>
<p>We could go ahead and create an index for <code>joined_on</code> but the database may still choose the <code>status</code> index and perform a filesort.</p>
<p>What would have better performance is a composite index with both <code>status</code> and <code>joined_on</code>.</p>
<pre><code class="language-sql">ALTER TABLE users ADD INDEX status_joined_on(status, joined_on);</code></pre>
<p>After adding the index, this is what the explain looks like:</p>
<pre><code class="language-sql">explain select * from users where status = 1 and joined_on &gt;= '2024-02-24' order by joined_on desc;
&gt; ... | key                 | .. | Extra
          status_joined_on  | .. | Using index condition; Backward index scan</code></pre>
<p>An index can be stored in either ascending or descending order depending on the definition. We see <code>Backward index scan</code> because we need the reverse order (descending) to sort results for the query above.</p>
<p>If we instead define the index with the <code>joined_on</code> column sorted in descending order, the <code>Backward index scan</code> is removed:</p>
<pre><code class="language-sql">ALTER TABLE users DROP INDEX status_joined_on;
ALTER TABLE users ADD INDEX status_joined_on(status, joined_on DESC);</code></pre>
<p>Now we can run the explain again:</p>
<pre><code class="language-sql">explain select * from users where status = 1 and joined_on &gt;= '2024-02-24' order by joined_on desc;</code></pre>
<p>This is an ideal index for this type of query.</p>
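<p>One caveat on column order (per MySQL's leftmost-prefix rule for multi-column indexes): the composite index can serve queries that filter on <code>status</code> alone, but generally not queries that filter only on <code>joined_on</code>. An illustrative sketch:</p>
<pre><code class="language-sql">-- Can use status_joined_on: status is the leftmost column of the index
select * from users where status = 1;

-- Generally cannot use status_joined_on: joined_on is not a leftmost prefix,
-- so without a dedicated index this falls back to a full table scan
select * from users where joined_on &gt;= '2024-02-24';</code></pre>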
<h2>Summary</h2>
<p>We explored creating indexes on relational databases and evaluated performance at each step. What did we learn along the way?</p>
<ul>
<li>
<p>An index in a relational database is a key-value mapping for one or many columns to tell the database which rows contain what values without having to scan each row.</p>
</li>
<li>
<p>Indexes can speed up query performance at the cost of write performance, though the former typically outweighs the latter.</p>
</li>
<li>
<p>For complex queries, it's possible to create a multi-column index. Ordering the columns is an important factor in its performance.</p>
</li>
<li>
<p>A descending index can help with searches for most recent data.</p>
</li>
</ul>]]></content:encoded>
</item>
<item>
  <title>Refactoring for Performance</title>
  <link>https://www.jackmarchant.com/refactoring-for-performance</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/refactoring-for-performance</guid>
  <pubDate>Mon, 12 Feb 2024 09:00:00 +0000</pubDate>
  <description>I spend most of my time thinking about performance improvements. Refactoring is tricky work, even more so when you’re unfamiliar with the feature or part of the codebase.</description>
  <content:encoded><![CDATA[<p>I spend most of my time thinking about performance improvements. Refactoring is tricky work, even more so when you’re unfamiliar with the feature or part of the codebase.</p>
<p>Some refactoring might be simple, but in this post I’ll attempt to dissect my approach to solving performance issues in the hopes it’ll provide value for others.</p>
<h2>Where do we start?</h2>
<p>Before we can design a solution to a performance issue we must understand the problem. For example, is a page not loading or is it very slow? Are there more queries than necessary to get data? Can we see a slow part in the process? How do we know it’s slow? Answering these questions first is a must.</p>
<p>Once we can reproduce the slow part reliably, and if code is the culprit, I start by taking that piece out and seeing how fast things could be without it, even though it may break or be incomplete. This shows me the maximum improvement possible through performance optimisation – as if the code didn’t run at all. </p>
<p>This is the incentive. If I know how much performance improvement is possible, it’s worth investing time into figuring out a solution. If I see marginal or little to no improvement, I’m either in the wrong place or it wasn’t as slow as I thought - time to move on.</p>
<p>The solution to the performance problem could be as simple as adding an index and as complicated as a complete rebuild. Code optimisation will naturally take longer than query optimisation because the behaviour of the code will generally change. If the problem is not that the query is slow but that the query runs thousands of times in a single request - those are two different problems to solve.</p>
<h2>Going from prototype to production</h2>
<p>The easiest way I get from identifying something slow to being able to fix the problem is to prototype the way I think it should work to be fast. Creating a prototype gives me the confidence the solution works at a high level, without addressing all of the edge cases. At minimum, I try to identify blockers standing in the way.</p>
<p>Once I’ve proven the solution works, I can invest more time to understand the product behaviour and the experience. How does the user actually use this feature? What are they trying to accomplish?</p>
<p>To be clear: this is the hardest point and often where the solution can fall over. If I misunderstand requirements or forget to include some parts, however minor they may seem, it undermines the performance optimisation and deflates any confidence in it when it comes time to release it.</p>
<p>Confidence is a fickle thing - it can be gone in an instant and hard to get back quickly. Customers are never going to applaud performance improvements - maybe it should have been fast to begin with - but many performance improvements add up to a better experience.</p>
<h2>Testing builds confidence</h2>
<p>Testing a performance improvement is like any other test of a change with the addition of a specific metric that you want to improve. For example if the goal of the refactor was to reduce page load time, compare the previous and current page load speed. If reducing the number of queries was the goal, show that the number of queries has gone down. I often start with manual tests to confirm impact on the user experience supported by some quantifiable metric. Screenshots, videos or links to observability metrics all support the fact that the refactor does what was intended.</p>
<p>Once I’ve covered the performance gains, the next thing to verify is correctness. To do this, I start with a few manual scenarios and compare the result of using the feature with and without my change. The most comprehensive way to do this is a test spreadsheet that marks pass or fail for each scenario: a user clicks a few buttons, and we assert the result is the same. Using a spreadsheet helps maintain regression tests and add test cases over time. Some features won’t be big enough to need it, but even if you never share the results with anyone and use it only for your own testing - it beats remembering every case each time you test.</p>
<p>One day you could even turn those manual tests into automated tests, if that’s not readily possible now. At least creating automated tests for any new code is a task worth doing.</p>
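<p>Such a before-and-after comparison can be sketched in a few lines. Here <code>legacyReport</code> and <code>fastReport</code> are hypothetical stand-ins for the old and optimised code paths - the point is only that every recorded scenario must produce identical results through both:</p>
<pre><code class="language-php">&lt;?php

// Hypothetical stand-ins for the before/after implementations.
function legacyReport(int $minStatus): array
{
    return array_values(array_filter([0, 1, 2, 3], fn ($s) =&gt; $s &gt;= $minStatus));
}

function fastReport(int $minStatus): array
{
    // The optimised path must return exactly what the legacy one did.
    return array_values(array_filter([0, 1, 2, 3], fn ($s) =&gt; $s &gt;= $minStatus));
}

// Each scenario from the test spreadsheet becomes an equality check.
foreach ([0, 1, 2] as $scenario) {
    assert(legacyReport($scenario) === fastReport($scenario));
}

echo "all scenarios match\n";</code></pre>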
<h2>How do performance improvements differ from features?</h2>
<p>Feature development creates new functionality where it didn’t exist before, so there’s often time to assess its effectiveness and test with customers, who might be more forgiving if something is not working. To break an existing feature that may be slow is to take it away. We must take extra care when dealing with something that works today for some users, even if it’s slow.</p>
<p>A performance improvement must be:</p>
<ul>
<li>Cheaper or faster</li>
<li>At least equal, ideally better behaviour</li>
</ul>
<p>It’s an unforgiving task, but rewarding when you can quantify performance improvements with a better experience for customers. Monitoring the outcome after release is a good place to start, even in the short term to verify the improvement was a success.</p>
<p><strong>The hardest question, which will remain unanswered, is how can we know when performance optimisations are done?</strong></p>]]></content:encoded>
</item>
<item>
  <title>Exploring Async PHP</title>
  <link>https://www.jackmarchant.com/exploring-async-php</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/exploring-async-php</guid>
  <pubDate>Wed, 31 May 2023 18:00:00 +0000</pubDate>
  <description>Asynchronous programming is a foundational building block for scaling web applications due to the increasing need to do more in each web request. A typical example of this is sending an email as part of a request.</description>
  <content:encoded><![CDATA[<p>Asynchronous programming is a foundational building block for scaling web applications due to the increasing need to do more in each web request. A typical example of this is sending an email as part of a request. </p>
<p>In many web applications, when something is processed on the server we want to notify people via email and it's common for this to be a separate HTTP request to a third-party service such as SendGrid, Mailchimp etc.</p>
<p>This becomes a more than trivial example when you need to send a lot of emails at once. In PHP, if you want to send an email and the HTTP process takes 100ms to complete, you'd quickly increase the total time for the request by sending tens or hundreds of emails. </p>
<p>Of course, any good third-party email service would provide a bulk endpoint to negate this, but for the sake of the example - let's say you want to send 100 emails and each has to be processed individually.</p>
<p>So, we need to make a decision: <strong>how can we move the processing of the emails into a separate process so that it doesn't block the original web request?</strong>
That is what we'll explore in this post, particularly all the different ways this can be solved in PHP with or without new infrastructure.</p>
<h2>Using exec()</h2>
<p><a href="https://www.php.net/manual/en/function.exec.php">exec()</a> is a native function in PHP that can be used to execute an external program and returns the result. In our case, it could be a script that sends emails. This function uses the operating system to spawn a completely new (blank, nothing copied or shared) process and you can pass any state you need to it.</p>
<p>Let's take a look at an example.</p>
<pre><code class="language-php">&lt;?php
// handle a web request

// record the start time of the web request
$start = microtime(true);
$path = __DIR__ . '/send_email.php';

// output to /dev/null &amp; so we don't block to wait for the result
$command = 'php ' . $path . ' --email=%s &gt; /dev/null &amp;';
$emails = ['joe@blogs.com', 'jack@test.com'];

// for each of the emails, call exec to start a new script
foreach ($emails as $email) {
    // Execute the command
    exec(sprintf($command, $email));
}

// record the finish time of the web request
$finish = microtime(true);
$duration = round($finish - $start, 4);

// output duration of web request
echo "finished web request in $duration\n";</code></pre>
<p><strong>send_email.php</strong></p>
<pre><code class="language-php">&lt;?php

$email = explode('--email=', $argv[1])[1];
// this blocking sleep won't affect the web request duration
// (illustrative purposes only)
sleep(5);

// here we can send the email
echo "sending email to $email\n";</code></pre>
<p><strong>Output</strong></p>
<p><code>$ php src/exec.php</code></p>
<pre><code class="language-bash">finished web request in 0.0184</code></pre>
<p>The above scripts show the web request still finishes in milliseconds, even though there is a blocking <code>sleep</code> function call in the send_email.php script.</p>
<p>The reason it doesn't block is that by including <code>&gt; /dev/null &amp;</code> in the command, we've told <code>exec</code> not to wait for the command to finish, meaning it can run in the background while the web request continues.</p>
<p>In this way, the web request script is simply responsible for running the script, not for monitoring its execution and/or failure. </p>
<p>This is an inherent downside of this solution, as the monitoring of the process falls to the process itself and it cannot be restarted. However, this is an easy way to get asynchronous behaviour into a PHP application without much effort.</p>
<p><code>exec</code> runs a command on the server, so you have to be careful about how the script is executed, particularly if it involves user input. It can also be hard to manage as the application scales: the script likely runs on the same box that processes external web requests, so you could end up exhausting CPU and memory if many hundreds or thousands of new processes are spawned via <code>exec</code>.</p>
<h3>pcntl_fork</h3>
<p><a href="https://www.php.net/manual/en/function.pcntl-fork.php">pcntl_fork</a> is a low-level function which requires the PCNTL extension to be enabled. It is a powerful, yet potentially error-prone, method for writing asynchronous code in PHP.</p>
<p><code>pcntl_fork</code> will fork or clone the current process and split it into a parent and a number of child processes (depending on how many times it is called). By detecting the Process ID or PID we can run different code when in the context of a parent process or a child process.</p>
<p>The parent process is responsible for spawning child processes and waiting until they have completed before it can itself complete.</p>
<p>In this case, we can have more control over how the processes exit and can easily write some logic to handle retries in case of failure in the child process.</p>
<p>Now, on to the example code for our use case to send emails in a non-blocking way.</p>
<pre><code class="language-php">&lt;?php

function sendEmail($to, $subject, $message)
{
    // Code to send email (replace with your email sending logic)
    // This is just a mock implementation for demonstration purposes
    sleep(3); // Simulating sending email by sleeping for 3 seconds
    echo "Email sent to: $to\n";
}

$emails = [
    [
        'to' =&gt; 'john@example.com',
        'subject' =&gt; 'Hello John',
        'message' =&gt; 'This is a test email for John.',
    ],
    [
        'to' =&gt; 'jane@example.com',
        'subject' =&gt; 'Hello Jane',
        'message' =&gt; 'This is a test email for Jane.',
    ],
    // Add more email entries as needed
];

$children = [];

foreach ($emails as $email) {
    $pid = pcntl_fork();

    if ($pid == -1) {
        // Fork failed
        die('Error: Unable to fork process.');
    } elseif ($pid == 0) {
        // Child process
        sendEmail($email['to'], $email['subject'], $email['message']);
        exit(); // Exit the child process
    } else {
        // Parent process
        $children[] = $pid;
    }
}

echo "running some other things in parent process\n";
sleep(3);

// Parent process waits for each child process to finish
foreach ($children as $pid) {
    pcntl_waitpid($pid, $status);
    $status = pcntl_wexitstatus($status);
    echo "Child process $pid exited with status: $status\n";
}

echo 'All emails sent.';</code></pre>
<p>In the above example using <code>pcntl_fork</code>, we fork the current process, which copies the parent process into new child processes, and wait for execution to complete. Additionally, after forking the child processes to send emails, the parent process can continue doing other things before ultimately ensuring the child processes have finished.</p>
<p>This is a step up from using <code>exec</code>, where we were limited in what was possible: those scripts run in completely separate contexts, so monitoring from an overall perspective is not possible.</p>
<p>We also gain process isolation as each child process runs in a separate memory space and does not affect other processes.
By tracking the process IDs we can effectively monitor and manage execution flow.</p>
<p>A downside of forking directly from the web request (the parent process) is that, because the parent waits for the child processes to finish, there is no benefit to the response time of the original request.</p>
<p>Fortunately, there is a solution to this and it's to combine both <code>exec</code> and <code>pcntl_fork</code> to get the best of both worlds, which looks like this:</p>
<ol>
<li>Web request uses exec() to spawn a new PHP process</li>
<li>The spawned process is passed a list of emails as a batch</li>
<li>The spawned process becomes the parent as it forks to send each email individually</li>
</ol>
<p>This can all happen in the background, rather than blocking the original request.</p>
<p>Let's take a look at making this work:</p>
<pre><code class="language-php">&lt;?php

$start = microtime(true);
$path = __DIR__ . '/pcntl_fork_send_email.php';
$emails = implode(',', ['joe@blogs.com', 'jack@test.com']);
$command = 'php ' . $path . ' --emails=%s &gt; /dev/null &amp;';

// Execute the command
echo "running exec\n";
exec(sprintf($command, $emails));
$finish = microtime(true);

$duration = round($finish - $start, 4);
echo "finished web request in $duration\n";</code></pre>
<p><strong>pcntl_fork_send_email.php</strong></p>
<pre><code class="language-php">&lt;?php

$param = explode('--emails=', $argv[1])[1];
$emails = explode(',', $param);

function sendEmail($to)
{
    sleep(3); // Simulating sending email by sleeping for 3 seconds
    echo "Email sent to: $to\n";
}

$children = [];

foreach ($emails as $email) {
    $pid = pcntl_fork();

    if ($pid == -1) {
        // Fork failed
        die('Error: Unable to fork process.');
    } elseif ($pid == 0) {
        // Child process
        sendEmail($email);
        exit(); // Exit the child process
    } else {
        // Parent process
        $children[] = $pid;
    }
}

echo "running some other things in parent process\n";
sleep(3);

// Parent process waits for each child process to finish
foreach ($children as $pid) {
    pcntl_waitpid($pid, $status);
    $status = pcntl_wexitstatus($status);
    echo "Child process $pid exited with status: $status\n";
}

echo "All emails sent.\n";</code></pre>
<p>The beauty of this solution, albeit more complicated, is that you can set up a separate process altogether whose responsibility is to run and monitor forked processes that do work asynchronously.</p>
<h2>AMPHP</h2>
<p><a href="https://amphp.org/">amphp</a> (Asynchronous Multi-tasking PHP) is a collection of libraries that allow you to build fast, concurrent applications with PHP.</p>
<p>The release of PHP 8.1 in November 2021 shipped support for <a href="https://www.php.net/releases/8.1/en.php#fibers">Fibers</a> which implement a lightweight cooperative concurrency model. </p>
<p>Fibers are what <code>amphp</code> v3 builds on to suspend and resume code without blocking the whole process, and they're why it's an exciting time for the future of PHP programs. Let's take a look at an example:</p>
<pre><code class="language-php">&lt;?php

require __DIR__ . '/../vendor/autoload.php'; // Include the autoload file for the amphp/amp library

use function Amp\delay;
use function Amp\async;

function sendEmail($to, $subject, $message)
{
    // delay() suspends only the current fiber, not the whole process
    // (in amphp v3 the timeout is in seconds)
    delay(3);
    echo "Email sent to: $to\n";
}

$emails = [
    [
        'to' =&gt; 'john@example.com',
        'subject' =&gt; 'Hello John',
        'message' =&gt; 'This is a test email for John.',
    ],
    [
        'to' =&gt; 'jane@example.com',
        'subject' =&gt; 'Hello Jane',
        'message' =&gt; 'This is a test email for Jane.',
    ],
    // Add more email entries as needed
];

$futures = [];

foreach ($emails as $email) {
    $futures[] = async(static function () use ($email) {
        $to = $email['to'];
        $subject = $email['subject'];
        $message = $email['message'];
        sendEmail($to, $subject, $message);
    });
}

// Wait for every future to complete before the script exits
\Amp\Future\await($futures);

echo "All emails sent.\n";</code></pre>
<p>The above script is a very simple version of running things asynchronously: each call to <code>async</code> creates a new fiber from the given closure and returns a <code>Future</code> object.</p>
<p>This is a much simpler version than rolling your own and does the heavy lifting for you, which is key for building an application as you don't need to worry about how the work is queued internally - you just know it happens asynchronously.</p>
<h2>Queues and Workers</h2>
<p>A solution to this problem also exists outside of PHP and prior to PHP 8.1 it could be considered the gold standard because it's language independent and highly scalable.</p>
<p>The use of queues such as <a href="https://aws.amazon.com/sqs/">Amazon SQS</a>, <a href="https://www.rabbitmq.com/">RabbitMQ</a> or <a href="https://kafka.apache.org/">Apache Kafka</a> has been a widely accepted solution for some time.</p>
<p>Queues are pieces of infrastructure that let you run workers independent of your application to process work asynchronously. This is not without risk or downside either, but it is tried and tested.</p>
<p>Let's get into an example:</p>
<p>The sender, in this example, is typically your existing web application.</p>
<p><strong>sender.php</strong></p>
<pre><code class="language-php">&lt;?php

require 'vendor/autoload.php';

use Aws\Sqs\SqsClient;

// Initialize the SQS client
$client = new SqsClient([
    'region' =&gt; 'us-east-1',
    'version' =&gt; 'latest',
    'credentials' =&gt; [
        'key' =&gt; 'YOUR_AWS_ACCESS_KEY',
        'secret' =&gt; 'YOUR_AWS_SECRET_ACCESS_KEY',
    ],
]);

// Define the message details
$message = [
    'to' =&gt; 'john@example.com',
    'subject' =&gt; 'Hello John',
    'message' =&gt; 'This is a test email for John.',
];

// Send the message to SQS
$result = $client-&gt;sendMessage([
    'QueueUrl' =&gt; 'YOUR_SQS_QUEUE_URL',
    'MessageBody' =&gt; json_encode($message),
]);

echo "Message sent to SQS with MessageId: " . $result['MessageId'] . "\n";</code></pre>
<p>Workers are an additional deployment of running code to process jobs.</p>
<p><strong>worker.php</strong></p>
<pre><code class="language-php">&lt;?php

require 'vendor/autoload.php';

use Aws\Sqs\SqsClient;

// Initialize the SQS client
$client = new SqsClient([
    'region' =&gt; 'us-east-1',
    'version' =&gt; 'latest',
    'credentials' =&gt; [
        'key' =&gt; 'YOUR_AWS_ACCESS_KEY',
        'secret' =&gt; 'YOUR_AWS_SECRET_ACCESS_KEY',
    ],
]);

// Receive and process messages from SQS
while (true) {
    $result = $client-&gt;receiveMessage([
        'QueueUrl' =&gt; 'YOUR_SQS_QUEUE_URL',
        'MaxNumberOfMessages' =&gt; 1,
        'WaitTimeSeconds' =&gt; 20,
    ]);

    if (!empty($result['Messages'])) {
        foreach ($result['Messages'] as $message) {
            $body = json_decode($message['Body'], true);

            // Process the message (send email in this case)
            sendEmail($body['to'], $body['subject'], $body['message']);

            // Delete the message from SQS
            $client-&gt;deleteMessage([
                'QueueUrl' =&gt; 'YOUR_SQS_QUEUE_URL',
                'ReceiptHandle' =&gt; $message['ReceiptHandle'],
            ]);
        }
    }
}

function sendEmail($to, $subject, $message)
{
    sleep(3); // Simulating sending email by sleeping for 3 seconds
    echo "Email sent to: $to\n";
}</code></pre>
<p>This solution consists of two parts:</p>
<ul>
<li>Sender (pushes a message to an SQS queue)</li>
<li>Worker (receives a message from a queue and sends an email)</li>
</ul>
<p>It can be scaled by increasing the number of workers relative to the number of messages sent by any number of senders.</p>
<p>By using a queue, the worker is completely independent from the sender and can be written in any language as the communication between sender and worker is through JSON messages.</p>
<h2>Which solution is best?</h2>
<p>It's almost impossible to say which of the solutions we've explored above would be best for your application. Although they all aim to solve the problem of running asynchronous code in PHP, the implementations are quite different and each has its own benefits and drawbacks.</p>
<p>To summarise each option in a few points:</p>
<h4>exec()</h4>
<ul>
<li>Perhaps the simplest and most effective way to run PHP scripts async</li>
<li>Fraught with potential security implications particularly around user input</li>
<li>Nothing being shared can be both a blessing and a curse</li>
<li>May increase load on existing server resources (CPU/memory)</li>
</ul>
<h4>pcntl_fork()</h4>
<ul>
<li>Allows management of parent/child processes to customise behaviour</li>
<li>Can be abstracted away in a simpler API for your application</li>
<li>Cloning the current process may cause other downstream issues</li>
</ul>
<h4>AMPHP</h4>
<ul>
<li>Requires PHP 8.1 for the use of Fibers</li>
<li>Library has abstracted away the &quot;hard parts&quot; of running async code</li>
<li>Steeper learning curve than more traditional methods (understanding the event loop and multi-tasking in PHP)</li>
</ul>
<h4>Queues and Workers</h4>
<ul>
<li>Language independent, flexible for any use case</li>
<li>Introduces a distributed system (can be a good or bad thing in the long run)</li>
<li>Many solutions around and different queue providers to make it easy</li>
</ul>
<h2>Conclusion</h2>
<p>The main reason I wanted to dive a bit deeper into all the different possibilities of async code in PHP is to understand how (if at all) the introduction of Fibers in PHP 8.1 changes how we can write async programs in the future.</p>
<p>There are many solutions available without requiring PHP 8.1 that have been battle tested, but it's interesting to see the direction the PHP language is going in to compete with the likes of <a href="https://go.dev/">Golang</a> and <a href="https://elixir-lang.org/">Elixir</a>, both of which support async programming and have done for years.</p>
<p>Ultimately, I would probably still reach for a Queue/Worker approach given the scalability and cross-platform/cross-language support - however I think over time we might see libraries such as <code>AMPHP</code> become more feature-rich and make this problem easier to solve without introducing new infrastructure.</p>
<p>To see the code samples used in this blog post, you can find them on <a href="https://github.com/jackmarchant/async-php/tree/main/src">GitHub</a>.</p>]]></content:encoded>
</item>
<item>
  <title>Maintaining feature flags in a product engineering team</title>
  <link>https://www.jackmarchant.com/maintaining-feature-flags-in-a-product-engineering-team</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/maintaining-feature-flags-in-a-product-engineering-team</guid>
  <pubDate>Fri, 01 Apr 2022 18:00:00 +0000</pubDate>
  <description>I have mixed feelings about feature flags. They are part of the product development workflow and you would be hard pressed to find a product engineering team that doesn’t use them. Gone are the days of either shipping and hoping the code will work first time or testing the life out of a feature so much that it delays the project.</description>
  <content:encoded><![CDATA[<p>I have mixed feelings about feature flags. They are part of the product development workflow and you would be hard pressed to find a product engineering team that doesn’t use them. Gone are the days of either shipping and hoping the code will work first time or testing the life out of a feature so much that it delays the project.</p>
<p>The benefits of using feature flags certainly outweigh the bad, but it doesn’t stop teams from cursing them every time a major bug is reported or an incident occurs as a result of enabled (or disabled) feature flags.</p>
<p>In this post I will discuss the benefits and some drawbacks of using feature flags, to help you learn from some of the lessons I’ve personally learned, in the hopes that you can avoid the mistakes.</p>
<p>First, let’s understand what a feature flag does, and why it’s there:</p>
<p>At its simplest, a feature flag is an on/off switch for releasing new functionality to your product through the code base - though there are more advanced controls you can use. </p>
<p>Teams can safely ship code knowing the feature can be enabled for small groups of users at a time, and released to more customers as confidence in a feature or behaviour grows. </p>
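<p>That gradual rollout is often implemented by bucketing users deterministically, so the same user always gets the same answer as the percentage grows. A minimal sketch (the helper and flag name here are hypothetical, not any specific feature-flag product):</p>
<pre><code class="language-php">&lt;?php

// Deterministically map a user to a bucket from 0-99 for a given flag.
// Increasing $percentage only ever adds users to the enabled group.
function isEnabled(string $flag, string $userId, int $percentage): bool
{
    $bucket = abs(crc32($flag . ':' . $userId)) % 100;
    return $bucket &lt; $percentage;
}

var_dump(isEnabled('new_checkout', 'user-42', 0));   // bool(false) - rollout not started
var_dump(isEnabled('new_checkout', 'user-42', 100)); // bool(true) - fully rolled out</code></pre>
<p>Hashing the flag name together with the user ID means different flags roll out to different slices of the user base, rather than always the same early group seeing every new feature.</p>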
<p>Before feature flags, you only had one shot to ship code to production and make sure it works, which meant a longer build up to releasing code for the first time, or complicated infrastructure to support canary releases. </p>
<p>The main problem with feature flags is what happens when you have too many and they start conflicting with each other or you have so many different flows to test that the team spends much longer on a feature than they should. </p>
<p>This leads me to the first lesson:</p>
<h2>Lesson #1</h2>
<h3>The number of feature flags you maintain will spiral out of control</h3>
<p>Every time you create a feature flag, you’re introducing a different behaviour for your code that may only be initially released to parts of your user base, meaning you now support two different behaviours (feature flag on and feature flag off). </p>
<p>This is empowering for a growing product and engineering team. As the number of feature flags in use in production grows, so too does the frustration of testing all of those different cases and receiving bug reports where your first instinct is to check which feature flags are enabled or disabled. </p>
<p>Keeping track of feature flags means attributing them to your team, so there’s ownership of each flag and its rollout status - including ensuring rollout continues to happen, or the flag is retired. </p>
<p>When feature flags have already spiralled and are out of control, the best thing to do is pause development of new features and clean up any flags that are rolled out or no longer required. </p>
<p>Getting feature flags back under control should be a priority, given the impact on development and testing time for related features. </p>
<p>This is a hard lesson to learn because there’s only one way out. Clean up the flags!</p>
<h2>Lesson #2</h2>
<h3>Clean up feature flags regularly, make it part of the development cycle</h3>
<p>After some time in production, feature flags can become stale and turn into technical debt, which must be paid back at some point or risk accumulating over time to the point of no return. In some ways you will always have to live with feature flags, and they become part of the work you do. </p>
<p>The difference between a feature flag that is new and part of the feature actively being worked on, versus a flag that has lost purpose is enormous. </p>
<p>To avoid this dysfunctional reality, we must clean up feature flags after the feature has been rolled out or no longer needed. Sometimes this will be a minor piece of work that involves only one person making the code change and testing it and other times it can require re-testing an entire feature. </p>
<p>In my experience, how well you have built up your automated testing around the feature will impact whether it’s a minor change or a major one. </p>
<p>Recently in my team, we had built up a lot of feature flags for a variety of reasons, whether it was changing teams, forgotten features or slow rollouts and so we had to take a week or two to clean up around 10-12 feature flags in a short period of time. </p>
<p>We ended up doing the work in a short time frame, then merging and releasing the changes incrementally over a longer period, just in case anything went wrong. </p>
<p>This proved to be successful in the end and the team swarmed on the work to get it done. </p>
<p>We’re now keeping track of feature flags more closely, making cleanup part of our development cycle rather than waiting for the eventual build-up. </p>
<p>When we release a new feature with a corresponding flag, we document it, ensure it keeps rolling out, and create a ticket to clean it up in the future.</p>
<p>Progressing with the rollout usually means opening up the feature to more customers and this brings me to the final lesson.</p>
<h2>Lesson #3</h2>
<h3>Always be rolling out</h3>
<p>Feature flags should be temporary and are meant to increase the velocity of your team by allowing you to ship quickly and get real feedback from smaller groups of users in a safe way. </p>
<p>Flags, therefore, should be intended to be rolled out completely to your whole user base at some point. It’s normal to start with a small group and then build up incrementally to larger groups, but this should always be on a timeline. </p>
<p>Once you forget about a flag, or move on to the next thing and leave it there, it will become stale, and your team falls into the trap of having to maintain it and test both states of the flag whenever a change to that area of the code base is required. </p>
<p>There’s no one right time frame for a feature flag to exist; it will always depend on the feature and the group of customers using it to give you feedback, directly or indirectly, through usage. </p>
<p>That’s why keeping track of the current state of the feature flags in your control is important, including managing the continual rollout to more users. </p>
<p>As you find bugs you can pause the rollout until they are fixed, but if you haven’t hit any roadblocks it’s critical to keep forging ahead, so that removing the feature flag becomes possible once it has been made available to all of your users. </p>
<p>Feature flags are both a blessing and a curse, which is probably no secret to most engineering teams. What’s missing, in my opinion, is a framework for managing feature flags over time and throughout engineering organisations. </p>
<p>They help keep the product working and make it easy to rollback changes without code deployment, and give on-call engineers peace of mind when they can safely turn off a flag that has caused an incident. </p>
<p>If left unchecked, feature flags can slow engineering teams down to a crawl, so create each feature flag with caution and a plan for its eventual removal. </p>
<p>Feature flags have given product teams the confidence to move fast, with a plan to rollback at the click of a button, but with great power and flexibility comes a cost which should not be underestimated.</p>
</item>
<item>
  <title>What I've learned doing technical interviews</title>
  <link>https://www.jackmarchant.com/technical-interviewing</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/technical-interviewing</guid>
  <pubDate>Fri, 18 Mar 2022 18:00:00 +0000</pubDate>
  <description>When I first started interviewing candidates for engineering roles, I was very nervous. The process can be quite daunting as both an interviewer and interviewee. The goal for the interviewer is to assess the candidate’s technical capabilities and make a judgement on whether they should move to the next round (there’s always a next round). Making a judgement on someone after an hour, sometimes a bit longer, is hard and error-prone.</description>
  <content:encoded><![CDATA[<p>When I first started interviewing candidates for engineering roles, I was very nervous. The process can be quite daunting as both an interviewer and interviewee. The goal for the interviewer is to assess the candidate’s technical capabilities and make a judgement on whether they should move to the next round (there’s always a next round). Making a judgement on someone after an hour, sometimes a bit longer, is hard and error-prone.</p>
<p>There are a lot of ways to assess someone in an interview for an engineering role. Depending on the role itself there may be certain requirements, such as a senior role needing more focus on system design, or a manager role focusing more on team dynamics than on the ability to write code.</p>
<p>In all types of interviews <strong>there is a lot you can learn</strong>, whether you’re a candidate or an interviewer, but in this article I’m going to talk about some of the things I’ve learned while doing many technical interviews over the past few years.</p>
<blockquote>
<p>Disclaimer: I am not pretending to be an expert on interviewing nor have the perfect process, but these practices are what I wish I knew before I started interviewing more regularly.</p>
</blockquote>
<h2>Having a clear, repeatable structure helps you and other interviewers ask the right questions</h2>
<p>When I started interviewing there was some process around what questions to ask, how the interview was structured and how it changed depending on seniority, but it was often not relevant, and I always found it odd to roll off 20 questions one after the other, especially when the questions probably weren’t relevant for the candidate.
I also found that stating the structure of the interview at the start allows the candidate to understand what’s coming and when we should finish, so we know to move on if one section is taking too long.</p>
<p>While you do want to have some canned questions in case you can’t think of any, I try to think more in terms of the topics or capabilities I want to assess rather than direct questions.
For example, rather than asking a specific question about React or VueJS, I would ask what technologies they have used to build reusable components on the frontend. This broader question allows the candidate to answer in a way that’s relevant to them.
If we want to assess whether they have experience with observability, we can ask how they would know whether something is failing or working correctly in production. </p>
<p>Setting these expectations up front helps you (the interviewer), any other interviewers and the candidate ask any questions before progressing and know what to expect during the interview.</p>
<h2>You don’t need to know everything to interview someone</h2>
<p>I used to dread interviewing someone who clearly had more experience than I did. They would probably say something I didn’t understand, or ask a clarifying question I couldn’t answer, thus prompting the question: why am I the interviewer?
While it doesn’t happen as much anymore, it’s not because I learned everything; it’s because I stopped trying to know the answer to every question. It’s ok to say you haven’t used that technology before or are unsure how something works, even as an interviewer. </p>
<p>Instead, ask a follow-up question if the candidate knows about it and you will learn something new.
Better yet, this will demonstrate the candidate’s ability to teach something to a co-worker, which is a very valuable skill to have as an engineer.</p>
<p>Your experience is likely to be different from that of every engineer you interview, so it’s better to acknowledge this rather than try to combat it or worry about being stumped on a question.</p>
<p>Win-win.</p>
<h2>When to help the interviewee along by giving hints</h2>
<p>Sometimes you’ll interview someone who has just started their career and hasn’t had much experience yet. They might need some extra pointers to get to where you’re trying to lead them. This isn’t a sign of weakness in an interview, but an opportunity for you to show how much they could learn from the company they’re interviewing with.
They may also just have different ideas to you and the other interviewers, which is perfectly fine, and even encouraged.</p>
<p>System Design is often not as simple as choosing a tool that can do the job and being done with it. More often there are a set of trade-offs that have to be considered, so offering the candidate a chance to understand the intent behind the question could be one way of aligning on what you’re getting at without specifically telling them what you’re looking for.</p>
<p>Some people will just need more time to think through their answers, as these whiteboard interviews can often be a source of anxiety for many engineers who don’t do them on a regular basis. Remember the desired outcome is not to trip the candidate into saying the wrong thing, but rather to provide them a chance to show you what they know and how they arrived at that conclusion.</p>
<h2>Asking the right questions helps form the basis for your judgement about hiring/not hiring</h2>
<p>It’s hard to know what questions to ask when everyone’s experience is different and they haven’t used the same technologies as you. Instead, we can ask questions that start a discussion about a broader topic, so the candidate can tell you what they know about that topic rather than fielding pointed questions.</p>
<p>Typically, I won’t have a prepared list of questions for the interview, but will instead know what topics to discuss based on previous experience or on what we’re trying to assess in the candidate. It’s less about specific technologies and more about their problem solving ability, critical thinking, and ability to make trade-offs and explain them clearly. </p>
<p>Sometimes engineers have to make decisions that aren’t ideal, based on a set of tradeoffs (maybe ones they don’t agree with either) that come from business goals or customer requests. Understanding where these tradeoffs come from, and the variety of reasons they are considered, is more valuable to me than any specific technology we could learn more about.</p>
<p>Asking a question you don’t know the answer to, but think the candidate will, is a good way to build trust: it shows you’re not trying to trip them up and genuinely want to learn from them - remember they are interviewing you and your company too!</p>
<h2>Practice is the only way to get better</h2>
<p>Fortunately (or not) the only way to get better at interviewing is to do more of it. Avoiding awkward silences, especially when an audio delay happens over a video call, comes with practice and doesn’t come naturally to everyone, myself included!</p>
<p>Eventually, I got better at either stalling by talking about something else until I could think of the next thing to ask about, or knowing in advance what question I wanted to ask next.
Being curious about the candidate’s experience helps broaden your own perspectives and helps them relax by talking about things they understand well.</p>
<h2>You want the candidate to do well, you’re not trying to trip them up on a trick question</h2>
<p>Above almost everything else, you should want the candidate to do well. It’s your responsibility as the interviewer to set them up for success, focusing on their strengths and letting them show you what they know rather than asking a set list of questions about topics they might not be familiar with.</p>
<p>This will result in a better interview experience for everyone involved, and even if they don’t progress to the next round or eventually get the job, they could still come back again or leave with a good interview experience to learn from.</p>
<p>Evaluating a candidate for an engineering role is a bit different to other roles and has its own intricacies due to the technical nature of what you’re trying to assess, but more often than not I have found technical skills can be learned, while attitude and how you treat other people are much harder to change. So, even though we’re assessing technical ability, it’s not an excuse to ignore the basic decency of giving respect to everyone on the call. Thankfully, there have not been too many occasions where this has impacted an interview.</p>
<p>Interviewing is a tricky subject and most of my interviews recently have been done over Hangouts, which has its own challenges. Having a good experience in an interview should be a given, even if it doesn’t work out in the end. </p>
<p>It’s just as much an interview of the company you’re representing as it is of the candidate. It’s the first chance someone has to experience your company and leaving a bad impression will generally last forever. I’ve learned a lot over the interviews I’ve done so far, and hopefully I’ll learn more in the interviews to come!</p>]]></content:encoded>
</item>
<item>
  <title>Using a Dependency Injection (DI) Container to decouple your code</title>
  <link>https://www.jackmarchant.com/using-a-dependency-injection-container-to-decouple-code</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/using-a-dependency-injection-container-to-decouple-code</guid>
  <pubDate>Wed, 03 Jun 2020 00:00:00 +0000</pubDate>
  <description>Dependency Injection is the method of passing one object to another (usually during instantiation) to invert the dependency created when you use an object. A Container is often used as a collection of the objects used in your system, to achieve separation between usage and instantiation.</description>
  <content:encoded><![CDATA[<p>Dependency Injection is the method of passing one object to another (usually during instantiation) to invert the dependency created when you use an object. A Container is often used as a collection of the objects used in your system, to achieve separation between usage and instantiation.</p>
<h2>What is Dependency Injection</h2>
<p>Take, for example, the repository pattern, whereby you use a separate class to handle database access so that you can separate that functionality from your application's business logic. You might instantiate a new Repository in a Service:</p>
<pre><code class="language-php">class BookService
{
    public function getSomeBooks()
    {
        $repository = new BookRepository();
        return $repository-&gt;getAll();
    }
}</code></pre>
<p>It doesn't really matter at this point what the <code>getAll</code> function does in <code>BookRepository</code>; the main point is that instantiating dependencies in the same place they are used creates an implicit dependency on <code>BookRepository</code>, leading to tightly coupled, hard-to-change code. In the example above, we're no longer able to test the service without a database connected, nor can we switch the repository out at runtime or setup, meaning less overall flexibility.</p>
<p>Instead, we could declare a private member variable on the <code>BookService</code> class, assign it during instantiation of <code>BookService</code> itself, and ensure whatever is passed in implements a specific interface, so that the functions you call on the repository are guaranteed to have been implemented.</p>
<pre><code class="language-php">class BookService
{
    private $repository;

    public function __construct(RepositoryInterface $repository)
    {
        $this-&gt;repository = $repository;
    }

    public function getSomeBooks()
    {
        return $this-&gt;repository-&gt;getAll();
    }
}</code></pre>
<p>Now, <code>BookService</code> has no idea which specific repository will be passed in, and it doesn't need to care, because it knows the repository implements <code>RepositoryInterface</code>. This is an example of the <a href="https://en.wikipedia.org/wiki/Dependency_inversion_principle">Dependency Inversion Principle</a> and is critical to understanding why Dependency Injection (DI) is an important concept in PHP (and most other languages), and why a DI container exists.</p>
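<p>This is also what makes the class testable: in a test, you can pass in a fake repository that implements the same interface, with no database required. Here's a minimal sketch (the interface method signature and the fake class are assumptions for illustration, and <code>BookService</code> is repeated so the example is self-contained):</p>
<pre><code class="language-php">interface RepositoryInterface
{
    public function getAll(): array;
}

// A fake repository for tests: returns canned data, no database needed.
class InMemoryBookRepository implements RepositoryInterface
{
    private $books;

    public function __construct(array $books)
    {
        $this->books = $books;
    }

    public function getAll(): array
    {
        return $this->books;
    }
}

class BookService
{
    private $repository;

    public function __construct(RepositoryInterface $repository)
    {
        $this->repository = $repository;
    }

    public function getSomeBooks()
    {
        return $this->repository->getAll();
    }
}

// Inject the fake instead of a real, database-backed repository.
$service = new BookService(new InMemoryBookRepository(['Dune', 'Emma']));</code></pre>
<p>The same trick works for any dependency hidden behind an interface: swap the real implementation for a fake at construction time.</p>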
<h2>How a DI Container makes this a lot easier</h2>
<p>We've seen how dependency injection can make testing easier and decouple your code so that it can change over time. Now consider what your codebase might look like if you had objects more complex than <code>BookService</code> and had to use them all over your codebase.
Everywhere you need to get some books, you need to instantiate both the <code>BookService</code> and its dependency <code>BookRepository</code>, so that it can be passed into the constructor.</p>
<pre><code class="language-php">$bookRepository = new BookRepository();
$bookService = new BookService($bookRepository);</code></pre>
<p>This is a great first step forward, but there's more that can be done to control how a <code>BookService</code> is instantiated and with what repository, since now it can be switched out with relative ease.
This is where a container comes in. If you've ever used <a href="http://www.slimframework.com/docs/v3/concepts/di.html">Slim Framework</a>, you might have noticed you can set up a DI container for your app.</p>
<pre><code class="language-php">$container = new \Slim\Container;
$app = new \Slim\App($container);

// Add a service to Slim container:
$container = $app-&gt;getContainer();
$container['BookService'] = function ($container) {
    $bookRepository = new BookRepository();
    return new BookService($bookRepository);
};</code></pre>
<p>Wherever in your code you need a new <code>BookService</code>, you can simply use the container to build a new object for you with the repository.</p>
<pre><code class="language-php">// Use your service
$bookService = $container-&gt;get('BookService');
$bookService-&gt;getSomeBooks();</code></pre>
<p>This makes any code that uses the <code>BookService</code> independent of the service as well, so by using the container, we're inverting the dependency on the <code>BookService</code>.</p>
<p>As you might have realised, the function we defined as the value for the <code>BookService</code> key in the container will be passed the container as an argument, meaning you can pull out any other dependencies that already exist in the container, such as the repository itself:</p>
<pre><code class="language-php">$container['BookService'] = function ($container) {
    return new BookService($container-&gt;get('BookRepository'));
};</code></pre>
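<p>For this to work, the repository itself needs to be registered in the container first, in the same way:</p>
<pre><code class="language-php">$container['BookRepository'] = function ($container) {
    return new BookRepository();
};</code></pre>
<p>Registrations like this are lazy: the closure isn't invoked until the first <code>get</code> call, so the order you define them in doesn't matter.</p>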
<p>There are endless possibilities for how you can use a Dependency Injection (DI) Container to your advantage, to decouple related objects and remove implicit dependencies so that your software can grow over time with boundaries in place. </p>]]></content:encoded>
</item>
<item>
  <title>3 simple tips to get better at working from home</title>
  <link>https://www.jackmarchant.com/3-tips-to-help-with-working-from-home</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/3-tips-to-help-with-working-from-home</guid>
  <pubDate>Fri, 17 Apr 2020 18:00:00 +0000</pubDate>
  <description>Working from home has been thrust upon those lucky enough to still have a job. Many aren’t sure how to cope, some are trying to find ways to help them through the day. Make no mistake, this is not a normal remote working environment we find ourselves in, but nonetheless we should find ways to embrace it.</description>
  <content:encoded><![CDATA[<p>Working from home has been thrust upon those lucky enough to still have a job. Many aren’t sure how to cope, some are trying to find ways to help them through the day. Make no mistake, this is not a normal remote working environment we find ourselves in, but nonetheless we should find ways to embrace it. </p>
<p>These three tips will help you get the most out of yourself in productivity, while maintaining a healthy work-life balance (in whatever way possible in our current circumstances). </p>
<p>I have worked from home regularly for a number of years, however it’s only since it became a full-time situation that I put these tips into practice and it has helped maintain my sanity, if nothing else. </p>
<h2>Set a start and end time</h2>
<p>There’s no better feeling than clocking off at the end of a work day, and while we probably don't have much more than a bit of exercise to look forward to, having a set time when you finish up work and check out for the day means your brain can adjust and get into the habit of switching off.</p>
<p>Especially now as we’re confined to our homes, it’s important to distinguish a time when you’re not “at work”. </p>
<p>Equally, starting work at a predictable time gives your brain a chance to clock on to work mode and focus. I have fallen into the trap of not having set a time to start, only to watch the clock fly by without getting anything meaningful done. </p>
<p>It’s easy to say, much harder to do and continue doing, but just like anything else, practice makes perfect. I set specific times that I must start and finish by, and I try to be realistic with those times. </p>
<h2>Have a break (particularly lunch)</h2>
<p>Just as important as setting times to start and end your day, it is worth your time to take a short break from your work to eat and drink.
Sometimes when you’re in the zone you forget about taking a break and you realise 3 or 4 hours have gone by without you leaving your seat. </p>
<p>A lunch break is like rebooting yourself (much like your computer). Your mind has probably been running around all morning and has a thousand things to think through. Taking time to step away from your workspace gives you the control to regain focus on what can be done in the afternoon. </p>
<p>I find that taking short, frequent breaks is beneficial to my own productivity and helps me to finish tasks more easily. It’s difficult for me to switch between different things, but when I finish something, I walk away and come back ready to go on the next thing. </p>
<h2>Dedicate one room to being your workspace</h2>
<p>A workspace has to feel comfortable and suited to your individual needs. If you can, it’s ideal to have a single room you can make your own so that when you’re in this room, it’s work time and perhaps more importantly, when you’re not in this room, work time is over. </p>
<p>Being able to leave a work environment makes it easy to separate work from the rest of the day. It also comes in handy when you’re on a video call and need to shut off the rest of your home from whatever noises may occur. This &quot;home-office&quot; also gives the people you live with the indication that you're working, meaning you may not want to be interrupted.</p>
<p>--</p>
<p>These little tips can make a big difference in your own wellbeing and productivity while working from home. I have become stricter with these boundaries as I’ve made the transition to doing it full-time and I’m so glad that I did because it has allowed me to get the best out of myself at work, while still being myself at home. </p>
<p>There are loads of great resources out there to help you with working from home, here are a few that I have found useful:</p>
<ul>
<li><a href="https://www.lifehacker.com.au/2013/07/the-developers-guide-to-working-from-home/">The developers guide to working from home</a></li>
<li><a href="https://www.smartcompany.com.au/people-human-resources/remote-work/working-from-home-transition">Staff working from home for the first time? These six tips will ease the transition</a></li>
<li><a href="https://time.com/5801725/work-from-home-remote-tips/">5 Tips for Staying Productive and Mentally Healthy While You're Working From Home</a></li>
</ul>]]></content:encoded>
</item>
<item>
  <title>Making Software - a three step process</title>
  <link>https://www.jackmarchant.com/making-software-a-three-step-process</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/making-software-a-three-step-process</guid>
  <pubDate>Tue, 14 Apr 2020 08:22:00 +0000</pubDate>
  <description>One of the most useful tips that has guided much of my decision-making over the years has been this simple principle: three steps, executed in sequential order:</description>
  <content:encoded><![CDATA[<p>One of the most useful tips that has guided much of my decision-making over the years has been this simple principle: three steps, executed in sequential order:</p>
<ol>
<li>Make it work</li>
<li>Make it right</li>
<li>Make it fast</li>
</ol>
<p>These steps outline the process through which software should be made. You should refer back to these steps and discover for yourself which step you are currently in while creating software. This will also help you identify whether you need to or indeed can move to the next step.
An important distinction to make up-front is that you don't necessarily need to complete all steps in order to ship software to users. Let me explain.</p>
<h2>Make it work</h2>
<p>The first step should be the most obvious, but the key to (and the most difficult) is acknowledging you are at step 1. Making software work is about glueing all of the pieces together until the thing you're trying to build actually works.
Imagine you're building a skateboard and you take a plank of wood and screw 4 wheels to the bottom. There should be criteria to allow you to recognise when it's finished and can move to the next step, for example:</p>
<ul>
<li>Can you stand on it?</li>
<li>Does it move forward when you push off?</li>
</ul>
<p>If the answer to both of these questions is yes, then congratulations, you've got a working skateboard.</p>
<p>Keen readers will poke holes in the analogy - the skateboard will break easily because the wheels aren't correctly fastened to the plank and will buckle after the first ride. While this may be true, the first step is making it work and we may not be expecting to produce thousands of these skateboards and allow customers to purchase them at this stage.</p>
<p>Bringing this back to software, if you catch yourself on step 1 and you're already thinking about how to optimise for performance or you've got the best abstraction idea that is extensible to the nth degree, then the battle may very well be lost because if you can't make it work, nothing else matters.
This step is more about what you don't do straight away, as opposed to what you do.</p>
<h2>Make it right</h2>
<p>Continuing the skateboard analogy, this next step provides you with time built into the process to take a step (pardon the pun) back and look at the big picture: the wheels need to be fitted correctly, with the proper materials and safety considerations, to withstand the rough and tumble expected in riding a skateboard.
This is no different to the wear and tear of software - without proper guards such as tests, abstractions and extensibility in place, the software will likely buckle under the pressure of real users.</p>
<p>This step is the right time to take the working thing and build it properly, regardless of whether you start it from scratch. The first step is more discovery than anything else and provides you with the confidence and knowledge of how to build the thing, so that in this step you can build the thing right.
Making software right ensures you have the correct checks and balances in place, such as applying principles to common problems and giving it the best chance of long-term sustainability.</p>
<p>At each step, a decision needs to be made about whether to progress to the next - for example there might be times in building software when making something work is more important than making it right. The key difference between this and pure negligence is the fact that this trade-off is a conscious choice, which makes it hard to see in hindsight unless well documented. It is often seen as tech debt, accrued for a purpose, but don't be fooled into thinking it's at all similar to financial debt (a common misconception, that may be for another time). </p>
<p>While I wouldn't recommend stopping before making it right, there are times where making something is better than not making anything at all. The hardest part is accepting the snowball effect this will have later down the track and whether you are actually prepared for the true cost.</p>
<p>Making it right, while technically a choice (hence it being a different step), is probably where most early software needs to land to be effective. Going further than this may become detrimental unless you have a reason to make it fast.</p>
<h2>Make it fast</h2>
<p>The final step is where all guns are blazing, the software is often in production, and you need the last piece of the puzzle to ensure the software works as intended (i.e. fast enough for the user). This step is all about optimising what you have done in the previous two. To keep the skateboard analogy going, at this step we would focus on the rider, ensuring they maximise their speed.</p>
<p>In software on the other hand, this may only happen once it's actually in the hands of users, so you know where the bottlenecks are. Sure, you could throw more hardware at it temporarily but eventually you have to make the software fast enough to scale correctly and appropriately.</p>
<p>In each step there are trade-offs but in this step the smallest decision can have the biggest impact, so you should always make data-driven decisions based on real usage rather than making educated guesses and hoping for the best. In time, experience will often tell you where the bottlenecks are, but this isn't always the case and depends on the system. Making it fast requires skill and practice and sometimes you don't even need this step until the software becomes popular enough that you have a reason to focus on speed. This is why it's best to analyse the data before embarking on this step.</p>
<h2>One step at a time</h2>
<p>In each step there is particular nuance to doing it right and doing it well. It's not like a recipe where the instructions tell you exactly what's required. All you can do is take it one step at a time and figure out where you can draw the lines around it. At times it may even be harder than building the software on its own, but being able to do this consistently will help you in the long-run.</p>
<p>I have found great focus from following these three steps when building any software professionally.</p>]]></content:encoded>
</item>
<item>
  <title>Help me, help you - Code Review</title>
  <link>https://www.jackmarchant.com/help-me-help-you-code-review</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/help-me-help-you-code-review</guid>
  <pubDate>Thu, 24 Oct 2019 12:18:00 +0000</pubDate>
  <description>Code Reviews are one of the easiest ways to help your team-mates. There are a number of benefits for both the reviewer and pull request author:</description>
  <content:encoded><![CDATA[<p>Code Reviews are one of the easiest ways to help your team-mates. There are a number of benefits for both the reviewer and pull request author:</p>
<ul>
<li>
<p>Product knowledge sharing through anecdotes or code samples</p>
</li>
<li>
<p>Sharing techniques for writing maintainable code</p>
</li>
<li>
<p>Differing perspectives collaborating on a single solution</p>
</li>
</ul>
<p>I find it helpful to consider the intent of each party participating in a code review. Let’s think about what’s important to both the reviewer and author and how each can contribute to the success of the other.</p>
<h3>As an Author, I want to submit my code for peer review, so that I can gather feedback and iterate on my solution</h3>
<p>When you submit a new pull request, depending on your previous experiences, you might feel nervous about the response from your peers. It’s normal to be attached to your own code if you’ve spent a long time writing it, however this attachment can be a detriment to your willingness to accept feedback. While it’s not easy not to take feedback personally, it’s better to assume that the person providing feedback has the same good intentions you had when writing the code.</p>
<p>The time it takes from when you open a pull request to when it is merged varies widely based on its content, risk and testability (among other things). When specific people or teams are best placed to review your code, it helps to reach out and let them know you’d like them to review it. It is your responsibility to follow up with them and get your code reviewed.</p>
<p>It can make a huge difference if the pull request has a clear (and sometimes thorough) description, breaking down the context for the change, what its effect is, and how it will be tested. In an ideal world we’d submit automated tests along with our code, but this is not always possible. We should strive as a team to prove that our code works as expected by including tests, manual or automated. This will aid the reviewer in understanding the reason for your change, so that they can in turn help you by giving feedback. </p>
<p>Tickets and pull requests are often treated as a 1-1 relationship. I don’t subscribe to this idea; instead I’d prefer to see one large code change broken down into multiple chunks (1 ticket, many pull requests) that can be easily pieced together (if necessary) and follow a progression to a releasable version of your code. Too often we try to build the whole solution and release it to users all at once, forgetting the risk involved in doing so. In many cases, hiding new code paths behind feature flags can make your code releasable without it necessarily running in production straight away. Breaking code down into manageable chunks makes the review process easier for both you and the reviewer.</p>
<p><strong>TLDR;</strong></p>
<ul>
<li>
<p>Make the pull request clear and concise.</p>
</li>
<li>
<p>Follow up with reviewers to look at your pull request</p>
</li>
<li>
<p>Provide comprehensive reasoning for the change, where possible</p>
</li>
<li>
<p>Break down large code changes into smaller chunks</p>
</li>
</ul>
<h2>As a Reviewer, I want to review code to ensure quality and accuracy and provide relevant feedback to help the Author move forward</h2>
<p>Reviewing code can be difficult at times - when there are a large number of lines or not enough context in the description, reviewing can be painful. As the reviewer, you want to ensure the correctness of the code in terms of its objective rather than its aesthetics - however, the quality of the code organisation and formatting can have an impact on the maintainability of the code in the long term. In a large team with many contributors, or an open source codebase, this is particularly important. Thankfully, there are automated formatters and code linters available in many different languages to help code authors meet this standard more easily. </p>
<p>When reviewing code, it’s easy to look at the code at face value and forget about the intent, requirements and restrictions under which the author wrote said code. Without knowing all of this, the reviewer should make an attempt not to assume the worst of the author and instead work with them to provide feedback in a polite and professional manner. </p>
<p>As a reviewer, you can only provide feedback from the information you’ve been given, whether that’s code, a description of the problem or a diagram of the solution. However, sometimes that is not enough, and you need to also seek further information from the author in order to provide great feedback. Your approach may vary depending on the size of the pull request, as with anything else in software development - it’s not a one size fits all situation, however it’s up to you as a reviewer to provide feedback that helps move the process forward.</p>
<p><strong>TLDR;</strong></p>
<ul>
<li>
<p>Automate as much of the formatting concerns as possible</p>
</li>
<li>
<p>Discover the intent behind the code, not just the code itself</p>
</li>
<li>
<p>Seek further information, if something is not clear</p>
</li>
</ul>
<h2>Before your next pull request</h2>
<p>You only get out of the pull request process what you put in, both as a reviewer and an author. On both sides of the equation, there are many things to consider before code gets shipped to production, many of which I haven’t discussed today. Before you review or open your next pull request, understand how you can make the process easier for you and your colleagues - help them, help you.</p>]]></content:encoded>
</item>
<item>
  <title>A practical guide to Test Driven Development</title>
  <link>https://www.jackmarchant.com/a-pratical-guide-to-test-driven-development</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/a-pratical-guide-to-test-driven-development</guid>
  <pubDate>Thu, 12 Sep 2019 12:18:00 +0000</pubDate>
  <description>It’s been a while since I last wrote about why testing is important, but in this post I thought I would expand on that and talk about not only why unit testing is important, but how a full spectrum of automated tests can improve productivity, increase confidence pushing code and help keep users happy.</description>
  <content:encoded><![CDATA[<p>It’s been a while since I last wrote about why testing is important, but in this post I thought I would expand on that and talk about not only why unit testing is important, but how a full spectrum of automated tests can improve productivity, increase confidence pushing code and help keep users happy. </p>
<h2>Why do we need to test code?</h2>
<p>Code gets tested every time users interact with your software, whether it’s through an application or part of an API. The unfortunate reality is that by the time your code is in the hands of users, it’s too late to find out it doesn’t work. </p>
<p>To reduce the chances of this occurring, we test code during development, after development (sometimes called Quality Assurance testing), right before releasing the code to users and even right after releasing. </p>
<p>At each step, it’s possible we could find a defect in the code and need to revert or write a fix to remedy the situation. The later the defect is found, the larger the impact and slower the turnaround to getting it fixed. </p>
<p>It is for these reasons that we test code at each step, building up confidence that the code does what we expect so that it may progress to the next stage in the development and release process.</p>
<p><strong>Definitions</strong></p>
<p>The following are definitions of terms I use throughout this post, and serve as a description of how I think about each type of test (these aren't necessarily textbook definitions).</p>
<p><strong>Unit Test</strong>: A test where the subject is an isolated block of code, typically a single function with no dependencies.</p>
<p><strong>Integration Test</strong>: A test where the subject could be a function with dependencies, or multiple functions/classes tested simultaneously.</p>
<p><strong>User Acceptance Test</strong>: The closest test to how a user will interact with your software, sometimes referred to as a Functional Test.</p>
<h2>Tests are crucial - regardless of when they happen</h2>
<p>It is always in your best interests as a developer to find bugs as early as possible. The ideal scenario is that you find a bug as you’re working on the code itself, by making a change and then running automated unit tests. This way, you can identify the problem, fix it by writing a test case for that scenario, and move on.
Not all bugs are created equal, however, and by the nature of software development, some code is harder to test than other code. This is why we introduce other forms of testing later in the development cycle, such as integration testing and user acceptance testing. </p>
<p>These three forms of testing (<em>Unit</em>, <em>Integration</em> and <em>User Acceptance</em>) build on top of each other to create a test pyramid. The general idea is that unit tests should be easy to create and run, as they have no external dependencies. Integration tests let you see how different modules, when hooked up together, respond to certain inputs. Finally, User Acceptance tests may place an entire vertical slice (incorporating many parts of your software, which may be slow or brittle) under test. As you go from most (unit) to many (integration) to some (user acceptance) tests, confidence that the overall system works correctly should increase. </p>
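<p>To make the pyramid concrete, here is a minimal sketch of the two lower layers. Python is used purely for illustration, and the <code>add_tax</code> function and in-memory repository are invented for this example: a unit test exercises one isolated function, while an integration-style test checks that function working together with a dependency.</p>

```python
# Unit test: the subject is a single, isolated function.
def add_tax(price, rate=0.1):
    return round(price * (1 + rate), 2)

def test_add_tax_unit():
    assert add_tax(100) == 110.0

# Integration test: two pieces under test at once - the pricing
# function plus a repository dependency it is wired up with.
class InMemoryPriceRepo:
    def __init__(self, prices):
        self.prices = prices

    def price_for(self, sku):
        return self.prices[sku]

def total_with_tax(repo, sku):
    return add_tax(repo.price_for(sku))

def test_total_with_tax_integration():
    repo = InMemoryPriceRepo({"book": 20.0})
    assert total_with_tax(repo, "book") == 22.0

test_add_tax_unit()
test_total_with_tax_integration()
```

<p>A User Acceptance test would sit above both, driving the software the way a user does (for example through the UI or a public API), which is why there are typically fewer of them.</p>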
<p>Having tests doesn’t make bugs disappear completely, but it does reduce the frequency of them, along with ensuring that changes you make don’t have unintended side effects. </p>
<p>Now that we’ve discussed some of the terminology and theory behind testing practices in software development, it’s much easier said than done. So, let’s talk about some ways you can incorporate testing into your development workflow. </p>
<p>In a large codebase, it’s worth having a few strategies for testing depending on the code needing to be tested, for example:</p>
<p><strong>New code integrating with existing code:</strong></p>
<p>In this scenario it makes sense to unit test any new code you write, as much as possible. At the point where you integrate the new code into an existing code path, you may not be able to test easily, but because you have confidence from the unit tests, you can try either an integration or user acceptance test. The former will likely run the existing code path, making sure the new code is exercised while the existing code still runs successfully. The latter may require manual or automated testing of the entire feature, during which your code is run. This has a slower feedback cycle, but it is an important step nonetheless. </p>
<p><strong>Fixing a bug in existing code:</strong></p>
<p>When you find a bug in your code, whether it’s during development or reported by a user, the best way to fix it is to write a test (any type will do) and then fix the code, ensuring that the test passes. </p>
<p>This will have a short term and long term effect:</p>
<ul>
<li>It will ensure you have actually fixed the bug. </li>
<li>And it allows the test to be run again in the future, making sure further changes haven’t caused a regression. </li>
</ul>
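<p>As a sketch of that bug-fixing workflow (Python for illustration; the <code>slugify</code> function and the bug itself are invented): first write a failing test that reproduces the report, then fix the code until the test passes, and leave the test in the suite as a regression guard.</p>

```python
import re

# Step 1: a test that reproduces the reported bug.
# Against the old, buggy code it fails; after the fix it passes.
def test_slugify_strips_punctuation_and_hyphenates():
    assert slugify("Hello World!") == "hello-world"

# Step 2: the fixed implementation.
def slugify(text):
    text = text.lower()
    # Collapse every run of non-alphanumeric characters into one hyphen.
    text = re.sub(r"[^a-z0-9]+", "-", text)
    return text.strip("-")

# Step 3: run the test now, and on every future change.
test_slugify_strips_punctuation_and_hyphenates()
```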
<p><strong>Approaching the Test Pyramid from scratch:</strong></p>
<p>Without any tests, or very few, the code is often hard to test, so it can be worthwhile starting from the top of the pyramid and working downwards. User Acceptance testing can be a good way to get started because you can mimic how a user interacts with the software. Then, as more tests are added, confidence that overall features are working might enable engineers to start building integration and unit tests, with a bit of refactoring along the way.</p>
<blockquote>
<p>Having tests doesn’t make bugs disappear completely, but it does reduce the frequency of them, along with ensuring that changes you make don’t have unintended side effects. </p>
</blockquote>
<h2>The effects of Testing over time</h2>
<p>Improving the maintainability of a codebase by increasing test coverage over time has a dramatic effect on teams, individuals and businesses. There are a number of fallacies surrounding testing in software development teams that hinder their collective ability to be productive. </p>
<p>A system that’s hard to test becomes a black box for developers, because it’s impossible to say with any certainty how something works. That being said, it is possible to open up the box and take parts out to figure out how they work. The best way I’ve found to learn a system is by introducing new tests.</p>
<p>There’s a common belief that Test Driven Development can only be practiced successfully through writing tests as if they are requirements, then writing code to satisfy the requirements. I would suggest that in reality, this is not how much of the software in the world is created - because it’s hard to do. </p>
<p>Instead, Test Driven Development, to me, is the practice of incorporating any kind of testing into your development cycle - meaning you’re not always writing tests first; it could be after you’re done, or midway through. The important part is to use automated and manual testing together to drive a faster feedback loop between writing code and knowing whether or not it works. </p>
<p>In practice there are trade-offs, just as in any other engineering decision, which need to be considered when adding tests to your development workflow. Let's stop debating whether TDD means red-green-refactor; all that debate does is discourage people from actually writing tests, for fear they're not doing it right.</p>
<p>There are always going to be tests, but which ones, and how will they be run? In answering these questions and developing with tests, you’ll find it increases your own productivity writing code and in the end it will improve the reliability of your software for your users.</p>]]></content:encoded>
</item>
<item>
  <title>The Facade Pattern</title>
  <link>https://www.jackmarchant.com/facade-pattern</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/facade-pattern</guid>
  <pubDate>Fri, 05 Jul 2019 09:00:00 +0000</pubDate>
  <description>Design Patterns allow you to create abstractions that decouple sections of a codebase with the purpose of making a change to the code later a much easier process.</description>
<content:encoded><![CDATA[<p>Design Patterns allow you to create abstractions that decouple sections of a codebase, with the purpose of making later changes to the code much easier.
They are a set of blueprints for solving specific sets of problems, and hopefully don’t over-complicate things. </p>
<p>There’s nothing worse than seeing an abstraction in a codebase that actually makes the code harder to understand than it would be without the abstraction.
Of course, it’s a trade-off, but often an easy way to tell when you should create an abstraction is when you start to see a pattern or repetition in the behaviours in your code - not necessarily just duplicated code. </p>
<p>I’ve been digging in to some design patterns lately, and one that I had to research again was the Facade Pattern.
If you don’t know what it is, you have probably already seen or used it many times before, but after reading this article, hopefully you’ll be able to identify the Facade Pattern in your own code. </p>
<h2>What does Facade mean?</h2>
<p>Facade literally means a deceptive outward appearance, and that’s potentially the wrong angle for thinking about solving a problem with software.
When you create a new function, it’s unlikely you’ll name it anything other than exactly what the function does. Naming things is hard in itself but that should at least be the aim. </p>
<p>The Facade Pattern used in your code should be a simple interface for doing something more complicated. It should group related things together to make it easier to use.</p>
<p>If you’ve ever integrated a third party library into your application you may have subconsciously used this pattern without realising. Say you’re building an app where users can purchase things: you might want to create a new customer account, charge the customer’s credit card and send an invoice email. </p>
<h2>An example of a Facade</h2>
<p>Rather than having to think about each of these requirements whenever a customer makes a purchase, we can wrap this functionality in a specific class created for the purpose of making a purchase, then the construction of the internal objects in the application are centralised and consistent regardless of the type of purchase.
This type of abstraction hides away some of the complicated parts of the process behind a friendly interface that can be used throughout the application with relative ease.
It is this interface that can be described as a Facade. </p>
<pre><code class="language-php">class Customer {
  public string $email;

  public function __construct(array $details) {
    // Assumes the details array contains an email address.
    $this-&gt;email = $details['email'];
  }
}

class Item {
  public float $price;
}

interface PaymentGateway {}

class PaymentService implements PaymentGateway {
  public function createCustomer(Customer $customer): self {
    // Register the customer with the gateway; return $this to allow chaining.
    return $this;
  }

  public function createCharge(Customer $customer, float $amount): bool {
    // Charge the customer; return whether the charge succeeded.
    return true;
  }
}

class Mailer {
  public static function send(string $to) {}
}

class PaymentFacade 
{
   public static function purchase(array $customerDetails, Item $item)
   {
      $customer = new Customer($customerDetails);
      $service = new PaymentService();
      $result = $service-&gt;createCustomer($customer)-&gt;createCharge($customer, $item-&gt;price);

      if ($result) {
        Mailer::send($customer-&gt;email);
      }

      return $customer;
   }
}</code></pre>
<p>For me, the Facade Pattern was a bit confusing, so I took some time to figure out exactly why and when it is used. To really assist in learning about design patterns in software, I would recommend reading popular projects’ source code so you can see how certain patterns are used - then you’ll be able to identify them in your own code. </p>]]></content:encoded>
</item>
<item>
  <title>The problem with Elixir Umbrella Apps</title>
  <link>https://www.jackmarchant.com/the-problem-with-elixir-umbrella-apps</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/the-problem-with-elixir-umbrella-apps</guid>
  <pubDate>Fri, 03 May 2019 18:00:00 +0000</pubDate>
  <description>Umbrella apps are big projects that contain multiple mix projects. Using umbrella apps can feel more like getting poked in the eye by an actual umbrella.</description>
  <content:encoded><![CDATA[<p><a href="https://elixir-lang.org/getting-started/mix-otp/dependencies-and-umbrella-projects.html">Umbrella apps</a> are big projects that contain multiple mix projects. Using umbrella apps can feel more like getting poked in the eye by an actual umbrella. </p>
<p>There are a few misconceptions about umbrella apps surrounding their purpose, how to effectively manage a growing project and deploying the app somewhere in production. I’d like to present a case for not using an umbrella app, if you were considering doing so.</p>
<h3>A typical umbrella app</h3>
<p>Umbrella apps are meant to help developers split different concerns into different apps, on the assumption that a parallel can be drawn to a Service Oriented Architecture (i.e. micro-services). In practice, an umbrella app can be thought of as a single application, and in most cases it is deployed as such. </p>
<p><strong>Let’s think through an example app:</strong>
In an application that has a web component serving traffic, type and resolver definitions for a GraphQL API, and data contexts for interacting with the database, we could have three apps in our umbrella:</p>
<ul>
<li>web</li>
<li>graphql</li>
<li>data</li>
</ul>
<p>We can still deploy this as a single project given the wonders of an umbrella app, and get some minor conveniences with config and testing. When a request comes in, the web app handles it, calling a resolver function in the graphql app, which in turn retrieves some data from the data app. The dependencies form a single chain: web -&gt; graphql -&gt; data, so compilation starts with the innermost dependency and works its way back, as you’d expect. However, because everything is deployed together, you can still access modules in ways that create circular dependencies, which rather breaks the separation concept. </p>
<h3>What’s the problem with this application?</h3>
<p>The main issue with this architecture is that the apps aren’t really split for the right reason. In an application that grows (in terms of code added over time), it will most likely slow you down as the boundaries become more brittle and blurred.
The reason for this effect is that umbrella child apps are intended to be created as a way to <strong>deploy</strong> each of them separately, hence the individual configuration and mix project. So unless you’re deploying the apps separately, there is no benefit to using an umbrella app.</p>
<p>There may come a time when you need to, but I can guarantee that retrofitting an umbrella configuration onto an existing app is a far easier option than consolidating child apps back into one.</p>
<blockquote>
<p>Umbrella child apps are intended to be created as a way to <strong>deploy</strong> each of them separately</p>
</blockquote>
<h3>A better alternative</h3>
<p>I’m not advocating to never use umbrella apps, but I think in most cases it’s better not to use one until you have the requirement to deploy a child app separately. </p>
<p>The alternative is to create a good old Elixir application using ‘mix new’ and place each of your concerns in its own folder. You can still accomplish the same architecture without using an umbrella app, and as a side bonus you’ll be able to quickly iterate and change your mind on decisions as you learn more about your business domain - and perhaps Elixir too!
This is a much easier way to get started with Elixir; in fact, <a href="https://hexdocs.pm/phoenix/contexts.html">Phoenix recommends structuring your apps in this way</a> through contexts.</p>
<p>My experience with umbrella apps has mostly been one of trying to reduce its complexity and favour modules over apps.</p>
<p>In fact, in <a href="https://elixir-lang.org/getting-started/mix-otp/dependencies-and-umbrella-projects.html#dont-drink-the-kool-aid">Elixir's official documentation</a> where it explains some of the benefits of using Umbrella apps, it does state a disclaimer:</p>
<blockquote>
<p>While it provides a degree of separation between applications, those applications are not fully decoupled, as they are assumed to share the same configuration and the same dependencies.</p>
</blockquote>
<p>Maybe your experience has been different? If anyone has found success with umbrella apps I’d love to discuss it! </p>]]></content:encoded>
</item>
<item>
  <title>Building Software with Broken Windows</title>
  <link>https://www.jackmarchant.com/broken-windows</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/broken-windows</guid>
  <pubDate>Sun, 14 Apr 2019 09:00:00 +0000</pubDate>
  <description>Ever get the feeling that adding this &amp;quot;one little hack&amp;quot;, a couple of lines of code, won't have much of an impact on the rest of the codebase? You think nothing of it and add it, convincing your team members it was the correct decision to get this new feature over the line. In theory, and generally speaking, I would kind of agree with doing it, but every hack is different so it's hard to paint them all with the same brush. If you've been doing software development for long enough you can see this kind of code coming from a mile away. It's the kind of code that can haunt your dreams if you're not careful.</description>
  <content:encoded><![CDATA[<p>Ever get the feeling that adding this &quot;one little hack&quot;, a couple of lines of code, won't have much of an impact on the rest of the codebase? You think nothing of it and add it, convincing your team members it was the correct decision to get this new feature over the line. In theory, and generally speaking, I would kind of agree with doing it, but every hack is different so it's hard to paint them all with the same brush. If you've been doing software development for long enough you can see this kind of code coming from a mile away. It's the kind of code that can haunt your dreams if you're not careful.</p>
<p>Back to the point, the code you added that was a little sub-par has introduced the possibility for a second hack to be added without the same reservations or questioning from team-members that you might have had before. A similar decision was made last time so we can let this one slide. You may even go so far as to add a comment detailing the hack, and the reasoning, patting yourself on the back before merging it in.</p>
<p>This kind of attitude can really add up quickly, without you even realising. I would classify this as the go-to technical debt example - the debt being the block of code you anticipate will need re-writing for one reason or another. Over time, you introduce code like this that isn't as performant as it should be, or wasn't written in a way that is extensible. Tech debt should be used like a bandage - a temporary fix to stop the bleeding - but left on for too long, it starts to bleed through.</p>
<p>At some point you will have to repay this debt, and figure out a way to remove the code you or your team added to get a &quot;quick win&quot;. Some instances are easier and more straightforward than others. When you're adding code, a good rule of thumb is to ask: &quot;how easily can this be removed?&quot;. It is the removal of code that is taken for granted. We assume code we write will live on for a long time, but in reality things change often, so we need to be able to move code around, delete it, or completely re-write it. Easy deletion of code should be the mark that something was created well, and isn't tangled in between many other files or functions.</p>
<p>When hacks add up over time and you find yourself piled up with tech debt, the attitude towards the codebase changes, to its detriment. This effect is known in software development as Broken Windows: seeing something that's already broken or poorly formed devalues your own opinion of it, so you either leave it broken or make matters worse by breaking more windows.</p>
<p>In this metaphor your codebase is a house, and you and your team live in this house. When you add a hack, it's like breaking a window. The first one you might patch up to stop the cold air getting in. Not patching it, however, will open the floodgates for more broken windows. Soon, you'll have three or four in your house. When a door handle inevitably breaks under the pressure of heavy usage, after seeing all of the broken windows, you'll probably just leave the door open and not close it anymore, rather than fixing it or buying a new handle. </p>
<h3>How did we get here?</h3>
<p>When your codebase is an unmaintainable mess, it's bad for business, it's bad for you (you have to keep fixing it) and it can make others in your team quit if it doesn't get better.</p>
<p>It might seem a bit dramatic to go from a simple hack to all those bad side effects, but it wasn't the hack itself - it was the attitude that ensued as a result.
Unchecked, these decisions pile up over time without anyone realising.</p>
<h3>How do we fix it?</h3>
<p>The first step is knowing you have a problem, just like any other problem. Identify problem areas in your codebase: places where nobody dares go until they are forced to add a new feature. Sounds familiar, right?
Instead of taking time out to refactor parts of the codebase for its own sake - which, I might add, is much harder to convince anyone is worth doing now rather than later - I would recommend waiting until you have a feature that needs to be implemented in that area, or could benefit from the refactoring. This ammunition can help you prioritise the refactoring ahead of the feature work itself, if it will make building the feature easier. Think of it like an investment.</p>
<p>Planning the redesign of the software along with the feature itself means that when it comes time to add the feature, it should be a piece of cake - assuming the planning and execution have gone well.</p>
<p>Whether it's a random hack or a poorly architected part of the software, you can treat the problems in the same way.
If you're thinking it's too late to refactor and you need to completely rewrite, I would urge you to think again - in my experience it's almost always harder to completely re-write, unless there are other factors in play beyond it simply being bad code.
It can be tempting to start again and commit to a new set of guidelines for how you build your software, but in the long run your team will need the discipline to see a problem and fix it, rather than starting again because things got so bad.</p>
<p>In Elixir, I would argue refactoring is at its easiest when you think about modules and functions, as opposed to hierarchical structures that you might find in Object-Oriented Programming languages. Of course, you can still get yourself into a mess in Elixir with the over-use of OTP features and apparent indirection that can come from Meta-programming with Macros.
In general I have found it easier than most other languages that I have used.</p>
<p>A simple mindset change might be all you need to progress from an unmaintainable codebase to one where it's easy to add new features. A popular one in programming is the Boy Scout rule: &quot;Always leave the campground cleaner than you found it&quot;, which in programming terms means you fix something that's broken when you see it - while you're touching that code.</p>
<p>It can also be helpful to take the codebase in the state that it's in now and discuss improvements with your team (or yourself if you're riding solo), and plan for the state you'd like it to be in. When you can agree on how the codebase should look, it's easier to make steps towards that goal each time you write code. Over time, this will pay off with the correct attitude.</p>
<p>Tech debt is a mystical beast that can break companies, teams and software alike. Through understanding of how problems like this arise in software development, it's possible to limit the effect it has.</p>
<p><em>Note for the reader: Planned tech debt is not an excuse for writing bad code, nor should it happen consecutively across features - you may be in more trouble than you think!</em></p>]]></content:encoded>
</item>
<item>
  <title>Lonestar ElixirConf 2019 Highlights</title>
  <link>https://www.jackmarchant.com/lonestar-elixir-2019</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/lonestar-elixir-2019</guid>
  <pubDate>Mon, 04 Mar 2019 08:22:00 +0000</pubDate>
  <description>Last week was Lonestar ElixirConf 2019 held in Austin, Texas. The conference ran over 2 days and was the first Elixir conference I had been to.</description>
  <content:encoded><![CDATA[<p>Last week was Lonestar ElixirConf 2019 held in Austin, Texas. The conference ran over 2 days and was the first Elixir conference I had been to.</p>
<p>In this article, I will recap some of my personal highlights from the conference, including my thoughts about some of the talks. Before I get into that however, I’d just like to say upfront how great it was to be in a room full of Elixir enthusiasts of all levels of experience. Some people were there to find a way to help sell their organisations on Elixir, and others had helpful insights into running Elixir in production. The conference was really well organised and the schedule allowed for plenty of breaks between all of the awesome talks.</p>
<h3>Nerves</h3>
<p>Opening the conference was a keynote delivered by Justin Schneck, author of <a href="https://nerves-project.org/">Nerves Project</a>. <a href="https://embedded-elixir.com">Embedded systems</a> are really taking off in the Elixir community and Justin made it clear to see why, as he showed how easy it was to get started with Nerves, and how to use <a href="https://www.nerves-hub.org">NervesHub</a>, which is a tool that allows you to manage firmware updates to physical devices, making deployments easy and secure.</p>
<p>While I haven’t done any work with Nerves yet it did make me interested to find a side project I could work on to give it a try.</p>
<p>Most of my work at <a href="https://vamp.me">Vamp</a> is on the web so it’s unlikely there will be a need for embedded systems, but it speaks to the flexibility and uniqueness of Elixir that something like Nerves can allow anyone to get started working with real devices.</p>
<h3>Distributed State Management</h3>
<p>A hot topic in the Elixir community revolves around <a href="https://dockyard.com/blog/2018/11/07/the-distributed-state-of-things-new-elixir-library-enhances-development">distributed state management</a> and its common pitfalls, paired with potential solutions.</p>
<p>We all know Elixir is great at concurrency and provides a programming model that makes it much simpler to reason about, but Elixir really shines when you add extra nodes as the application scales.</p>
<p>This was the subject of the talks from the first morning of the conference with both describing the complexities involved in great detail.</p>
<p>The talks didn’t really offer a specific solution (although they mentioned the likes of <a href="https://github.com/bitwalker/swarm">Swarm</a> - a distributed process registry) but instead referred to trade offs that anyone facing these problems will have to make, most notably <a href="https://en.wikipedia.org/wiki/CAP_theorem">CAP theorem</a> and the balancing scale of Consistency and Availability, given that there will always be a Network Partition in a distributed system.</p>
<p>I really enjoyed this section, so it was a great start to the conference.</p>
<h3>Ecto</h3>
<p>With the recent <a href="http://blog.plataformatec.com.br/2018/10/a-sneak-peek-at-ecto-3-0-breaking-changes">split of the Ecto library into two parts</a>: <code>ecto</code>, and <code>ecto_sql</code> in an effort to make it more visible to developers that Ecto can be used without a database, it was only fitting that there would be a few talks about Ecto. I particularly enjoyed Greg Vaughn's &quot;Ecto without a DB&quot;, in which Greg presented  practical examples of using <a href="https://hexdocs.pm/ecto/Ecto.Changeset.html">Ecto Changesets</a> to validate external data, mapping to structs and applying certain actions to achieve the same validation you would expect with <a href="https://hexdocs.pm/ecto/Ecto.Repo.html">Repo</a> callbacks such as <code>insert/1</code> and <code>update/1</code>.</p>
<p>Generally, this approach seemed to highlight the fact that having data structures in your application instead of ad-hoc maps in domain logic makes handling errors easier and prevents messy code.</p>
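<p>As a rough sketch of that idea (not Greg's actual code, and deliberately free of Ecto itself): a plain struct plus a cast function validates untrusted external data in much the same shape as an <code>Ecto.Changeset</code> pipeline, with no database involved. The <code>Signup</code> struct and its fields here are hypothetical.</p>
<pre><code class="language-elixir">defmodule Signup do
  defstruct [:email, :name]

  # Validate an external (untrusted) map and map it onto a struct,
  # mirroring the cast/validate flow of an Ecto.Changeset.
  def cast(params) when is_map(params) do
    case {fetch_string(params, "email"), fetch_string(params, "name")} do
      {{:ok, email}, {:ok, name}} -> {:ok, %Signup{email: email, name: name}}
      {{:error, _} = error, _} -> error
      {_, {:error, _} = error} -> error
    end
  end

  defp fetch_string(params, key) do
    case Map.get(params, key) do
      value when is_binary(value) and value != "" -> {:ok, value}
      _ -> {:error, {:invalid, key}}
    end
  end
end</code></pre>
<p>Callers get back either a well-formed struct or a tagged error, so the domain logic never has to pick apart an ad-hoc map.</p>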
<h3>The Business Case for Elixir</h3>
<p>Brian Cardarella (CEO of Dockyard) presented his views on the business case for Elixir, specifically referring to 4 main points (paraphrasing):</p>
<ul>
<li><strong>Stability:</strong> The stable releases in Elixir coupled with the plans not to release a 2.0 version of the language any time soon, means that developers can have confidence that code they write today will be able to stand the test of time.</li>
<li><strong>Efficiency:</strong> Developer productivity with a language is very important, especially for start-ups who need to get to market with new features quickly. Elixir’s low cognitive load when working with modules (groups of functions) means parts of the system can be changed more easily than in other languages where you might need a more holistic understanding of the application’s code base.</li>
<li><strong>Scalability:</strong> Elixir is known for its ability to scale with minimal effort, at minimum cost. This makes it a very attractive solution for smaller teams.</li>
<li><strong>Tractability:</strong> Elixir’s popularity is on the rise and Brian expects that by 2020 we’ll be seeing many more companies using Elixir in production.</li>
</ul>
<p>Overall, Brian equipped those who want to bring Elixir into their own organisations with the right talking points to get the job done.</p>
<h3>Phoenix LiveView</h3>
<p>Although it had <a href="https://dockyard.com/blog/2018/12/12/phoenix-liveview-interactive-real-time-apps-no-need-to-write-javascript">already been announced</a> (but not released yet), Phoenix LiveView was presented to the audience at Lonestar ElixirConf, with Chris McCord promising a release as early as the end of the month, and at the latest within the coming months.</p>
<p>Chris spoke about the motivations for building LiveView and stressed the goal of delaying the inevitable single page application path for as long as possible. How long that is will be determined after release when people have had time to use it.</p>
<p>I am personally quite optimistic about it and although I'm happy to keep writing JavaScript on the frontend whenever I need to, it will make building prototype apps to showcase Elixir's real-time capabilities much easier.</p>
<p>The conceptual programming model for LiveView is very similar to that of <a href="https://reactjs.org">React</a> and other JavaScript libraries, in that each component has a parent-child relationship, with the default behaviour that if a child component fails, it can be restarted back to its last known state. It is in this way they behave like children in a supervision tree. The similarities between frontend view libraries like React and Elixir/Erlang supervision trees are a <a href="https://www.jackmarchant.com/articles/a-comparison-of-elixir-supervision-trees-and-react-component-trees">topic I have written about before</a>.</p>
<h3>Erlang Ecosystem Foundation</h3>
<p>In Jose Valim’s keynote, which capped off the presentations for the conference, he introduced the <a href="https://erlef.org">EEF</a> (Erlang Ecosystem Foundation) as a new organisation run for the community of Elixir and Erlang (and any other languages running on the BEAM, the virtual machine on which Erlang runs). Its goal is to procure funding for projects within the community to help improve the tooling that surrounds Erlang.</p>
<h3>Broadway</h3>
<p>Jose also presented <a href="http://blog.plataformatec.com.br/2019/02/announcing-broadway">Broadway</a>, a new library that was released only a week ago. Broadway is an extension of GenStage, which models producers and consumers as stages in a pipeline that ingests and processes data.</p>
<p>The new library is meant to allow for distributed pipelines to operate in parallel.</p>
<p>Rather than adding these features into the Elixir language, the core team prefers to keep the standard API small and closely matching with Erlang.</p>
<h3>Wrapping up</h3>
<p>It’s an exciting time to be an Elixir developer, as the language has matured over the years without making any drastic changes. In the years to come I hope there will be more success stories from companies using Elixir, along with more attention for the developer productivity and happiness that come from building and scaling Elixir applications.</p>
<p>Lonestar ElixirConf 2019 seemed to be very successful and I certainly enjoyed being there.</p>
<p>I would like to thank <a href="https://vamp.me">Vamp</a> for sponsoring me to make the journey from Australia and hope to see more Australian companies take part in the global Elixir community.</p>]]></content:encoded>
</item>
<item>
  <title>Using a GenServer to handle asynchronous and concurrent tasks</title>
  <link>https://www.jackmarchant.com/genserver-async-concurrent-tasks</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/genserver-async-concurrent-tasks</guid>
  <pubDate>Fri, 01 Feb 2019 18:00:00 +0000</pubDate>
  <description>In most cases I have found inter-process communication to be an unnecessary overhead for the work I have been doing. Although Elixir is known for this (along with Erlang), it really depends on what you’re trying to achieve and processes shouldn’t be spawned just for the fun of it. I have recently come across a scenario where I thought having a separate process be responsible for performing concurrent and asynchronous jobs would be the best way to approach the problem. In this article I will explain the problem and the solution.</description>
  <content:encoded><![CDATA[<p>In most cases I have found inter-process communication to be an unnecessary overhead for the work I have been doing. Although Elixir is known for this (along with Erlang), it really depends on what you’re trying to achieve and processes shouldn’t be spawned just for the fun of it. I have recently come across a scenario where I thought having a separate process be responsible for performing concurrent and asynchronous jobs would be the best way to approach the problem. In this article I will explain the problem and the solution. </p>
<h3>Requirements</h3>
<p>The goal of this work was to asynchronously handle requests to move static assets from one provider to another. This means downloading the original to a temporary file on the server, then uploading it to the new provider and saving results in a database. </p>
<ul>
<li>A GraphQL mutation needs to trigger this asynchronous job and not block the response. </li>
<li>When the job completes, either successfully or with a failure, we should report it or handle it in some way. </li>
<li>Multiple requests will come through concurrently, meaning the process shouldn’t be blocked from handling another request because one is still running. </li>
<li>A request may trigger one or many jobs</li>
</ul>
<h3>The process of finding a solution</h3>
<p>There are many different options for structuring your Elixir applications in terms of the supervision tree - when and where to spawn processes, and which type of process suits your use case, is often a guessing game until you’ve used them all extensively.</p>
<p>My first thought was to use a <a href="https://hexdocs.pm/elixir/DynamicSupervisor.html">DynamicSupervisor</a> (i.e <a href="https://hexdocs.pm/elixir/Task.Supervisor.html">Task.Supervisor</a>) and specifically create new supervised processes when the work needed to be done, and on demand.</p>
<p>This didn’t really work how I thought it would because the main process would still block until all the tasks were finished before responding to the initial request. </p>
<p>The next solution I tried was to send messages to a <a href="https://hexdocs.pm/elixir/GenServer.html">GenServer</a>, and have it do the work so that the main process could return a response almost immediately. While this got most of the way to solving the problem, a well-known limitation of GenServers is that they can only handle one message at a time, so while this solution provides the asynchronous behaviour, it loses the benefit of concurrency.</p>
<p>The solution I ended up going with (which seems to work so far) wasn’t too far away from the GenServer solution. The only difference is that when we schedule a job to be done, the GenServer spawns a <a href="https://hexdocs.pm/elixir/Task.html">Task</a> with <a href="https://hexdocs.pm/elixir/Task.html#async/1">Task.async/1</a>, the benefit of which is that it will always send a message back to the caller when it’s finished, even if you don’t use <a href="https://hexdocs.pm/elixir/Task.html#await/2">Task.await/2</a>.</p>
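<p>You can see this behaviour in a few lines of plain Elixir - the reply lands in the caller's mailbox tagged with the task's ref, even though we never call <code>Task.await/2</code>:</p>
<pre><code class="language-elixir"># Task.async/1 always sends its result back to the caller as a message,
# whether or not Task.await/2 is ever used.
%Task{ref: ref} = Task.async(fn -> {:ok, :job_done} end)

result =
  receive do
    # The reply tuple is tagged with the task's monitor ref
    {^ref, value} -> value
  end
# result is {:ok, :job_done}</code></pre>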
<p>As it is a GenServer that is spawning these tasks, it can handle generic messages sent to it quite easily with the <a href="https://hexdocs.pm/elixir/GenServer.html#c:handle_info/2">handle_info/2</a> callback. This is where the GenServer handles success or failure states of each task, and processing each result synchronously is not a problem in this case. </p>
<p>Here's a snippet of the GenServer that spawns these Task processes.</p>
<pre><code class="language-elixir">defmodule TaskRunner do
  use GenServer
  require Logger

  @me __MODULE__

  def start_link(opts) do
    GenServer.start_link(@me, opts, name: @me)
  end

  def init(opts), do: {:ok, opts}

  def run(fun) do
    GenServer.cast(@me, {:run, fun})
  end

  def handle_cast({:run, fun}, state) do
    Task.async(fun) # sends a message back to the TaskRunner when completed
    {:noreply, state}
  end

  # handle_info/2 receives generic messages from the Task processes
  def handle_info({_task, {:ok, result}}, state) do
    Logger.info("#{inspect(result)} Job Done.")
    {:noreply, state}
  end

  def handle_info({_task, {:error, reason}}, state) do
    Logger.error("Failed to complete job: #{reason}")
    {:noreply, state}
  end

  def handle_info(_, state), do: {:noreply, state}
end</code></pre>
<p>What's interesting about this code is that it may actually be reimplementing something that already exists in Elixir that I haven't quite got my head around yet - either way, I haven't got a problem with doing it this way as long as it works! Wrapping the spawning of a Task in a GenServer simply provides the ability to &quot;schedule&quot; tasks (as each message is processed sequentially), while responding to the result of each task individually.</p>
<p>In theory, if we were to send a bunch of messages that get &quot;queued&quot; for processing in the GenServer's mailbox, a problem may arise: if the application terminates, the GenServer will lose all of its messages and those tasks will be lost. At this point, however, I would prefer to see how much of a problem this turns out to be, as there would be various factors to consider.</p>
<p>I’m still not sure if this is going to be the best way to architect this asynchronous, concurrent behaviour, but in the few cases where I’ve thought an OTP approach makes sense I have often found many different ways to solve this kind of problem - which is both a good and bad part of Elixir.</p>]]></content:encoded>
</item>
<item>
  <title>Best practices for integrating with third-party libraries in Elixir</title>
  <link>https://www.jackmarchant.com/best-practices-third-party-integrations</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/best-practices-third-party-integrations</guid>
  <pubDate>Wed, 19 Dec 2018 09:00:00 +0000</pubDate>
  <description>When we think about what an application does, it's typical to think of how it behaves in context of its dependencies. For example, we could say a fictitious application syncs data with a third-party CRM.</description>
  <content:encoded><![CDATA[<p>When we think about what an application does, it's typical to think of how it behaves in context of its dependencies. For example, we could say a fictitious application syncs data with a third-party CRM.
The way we think about our application impacts how we make abstractions in our code. If we think about a typical web application, we might have a database, router, controllers and some business logic around how we use our data and show it on the page. In many cases, we need to integrate our app with external APIs, third-party libraries and more.
It's critical for most web applications to abstract concepts to make the code both easier to read and change in the future.
In many other languages, we often see <a href="https://en.wikipedia.org/wiki/Interface_(computing)">interfaces</a> coupled with <a href="https://en.wikipedia.org/wiki/Dependency_injection">dependency injection</a> in use to achieve these goals. In Elixir, the &quot;best practice&quot; approach isn't always as clear.</p>
<p>In this article, I will discuss a typical scenario of integrating with a third-party API and detail a potential approach you could use on your next project.</p>
<p>When we start writing an integration with a third-party, we should think about how the rest of the application will use it and how it should behave in certain circumstances. Our goal should be that we can have a single internal module whose responsibility is to interface with the external dependency.</p>
<p>In most cases, you shouldn't need to write any more code when requirements change - you might have to add extra functionality, but your business logic (the code using the internal module) shouldn't have to change too dramatically just because you moved from &quot;Pretty Good CRM&quot; to &quot;Greatest CRM Ever&quot;.
That being said, I don't really subscribe to the idea that your code is ever going to be perfect and that you'll have a perfect abstraction around your CRM of choice, such that you could even swap modules at runtime and be able to use both simultaneously. However, I would expect that it's not going to be a particularly painful piece of work that involves rewriting any of your own business logic.</p>
<p>To help achieve a loose-coupling in our system, we can use a <a href="https://fideloper.com/hexagonal-architecture">Hexagonal Architecture</a>, a fancy way of saying our goal is to push all external dependencies to the edges of our application, separating our core business logic from some of the side effects that might be performed. Typically, this is implemented by wrapping external libraries (dependencies) and only using those wrapper modules throughout the rest of your code base. A good rule of thumb would be to only have one module that represents an external dependency in your code, whether that's an API or a Database.</p>
<p>In Elixir we use this approach already with the Repo module, which maps to an Ecto data store. When we create a module in our app adopting a certain behaviour, we create a wrapper around Ecto's Repo module.</p>
<pre><code class="language-elixir">defmodule MyApp.Repo do
  use Ecto.Repo,
    otp_app: :my_app,
    adapter: Ecto.Adapters.Postgres
end</code></pre>
<p>Now, everywhere in our code we use MyApp.Repo rather than using Ecto directly to run SQL commands. There are other reasons we use a Repo in this way, but I find it's a good conceptual model to represent a wrapper module.</p>
<h3>How to go full-hexagonal</h3>
<p>Imagine a world where the CRM you chose had a supported library written in Elixir, so you thought you'd use that in your application. It's called ExCRM (just go with it). For us to implement a hexagonal architecture, we would need to push this dependency to the boundary of our application, by creating a single module to wrap the behaviour of the library. Now, whenever we want to push something to our CRM, we need to call this wrapper module, rather than the library directly. In doing so, we only ever reference the library in one place and create a consistent interface with the rest of our application, through the wrapper module.</p>
<p>It might look something like this: </p>
<pre><code class="language-elixir">defmodule MyApp.CRM do
  def save(user) do
    user
    |&gt; to_crm()
    |&gt; ExCRM.save()
  end

  defdelegate list_users, to: ExCRM

  defp to_crm(user) do
    %{"name" =&gt; user.name}
  end
end

defmodule MyApp.Data do
  alias MyApp.{CRM, Repo}

  def create(user) do
    with {:ok, user} &lt;- Repo.create(user) do
      CRM.save(user)
    end
  end
end</code></pre>
<p>At first this looks like a little bit of indirection, and because it's a contrived example it's hard to see the benefits straight away. </p>
<p>The effect of writing our code in this way is that it limits the blast radius should things change in the library or our own app's requirements. Through limiting the entry-points for the library in your application, we are able to minimize the impact of any such change. While it takes slightly more effort to set up in the beginning, when business requirements change you can change your implementation without refactoring lots of different files within your codebase.</p>
<p>Testing your code becomes easier with this method, as you will only need to mock your internal module, rather than the library itself, in all other parts of your codebase.
Isolating dependencies is not a new concept, and in most other programming languages there are clear examples of how to do this, particularly in Object-Oriented languages such as PHP or Ruby.</p>
<p>It's often thought that functional programming and object-oriented programming are at odds with each other and have very different approaches to solving these types of problems, but they actually have a lot of overlapping concepts. Both approaches have the goal of creating maintainable, bug-free applications, sharing concepts but differing in implementation. </p>
<p>While in Elixir, we think about transforming data from one form to another as opposed to instances of objects that have state, we can still use similar patterns and adapt them for Elixir. There are more parallels than you might think. </p>]]></content:encoded>
</item>
<item>
  <title>You might not need a GenServer</title>
  <link>https://www.jackmarchant.com/you-might-not-need-a-genserver</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/you-might-not-need-a-genserver</guid>
  <pubDate>Tue, 20 Nov 2018 18:00:00 +0000</pubDate>
  <description>When you're browsing your way through Elixir documentation or reading blog posts (like this one), there's no doubt you'll come across a GenServer. It is perhaps one of the most overused modules in the Elixir standard library, simply because it's a good teaching tool for abstractions around processes. It can be confusing though, to know when to reach for your friendly, neighbourhood GenServer.</description>
  <content:encoded><![CDATA[<p>When you're browsing your way through Elixir documentation or reading blog posts (like this one), there's no doubt you'll come across a GenServer. It is perhaps one of the most overused modules in the Elixir standard library, simply because it's a good teaching tool for abstractions around processes. It can be confusing though, to know when to reach for your friendly, neighbourhood GenServer.</p>
<p>A GenServer is a generic implementation of typical client &lt;-&gt; server interactions, where the client is a process and the server is your GenServer. This abstraction exists because without it we would have to write a lot more boilerplate code around receiving messages from other processes.</p>
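<p>To make that boilerplate concrete, here is a rough sketch (a hypothetical <code>Counter</code> module, not from the standard library) of the raw <code>spawn</code>/<code>receive</code> loop you would otherwise maintain yourself - the recursive loop and the reply plumbing are exactly what GenServer generalises:</p>
<pre><code class="language-elixir">defmodule Counter do
  # A hand-rolled server: spawn a process that loops over its state,
  # receiving messages and sending replies back to the caller.
  def start do
    spawn(fn -> loop(0) end)
  end

  defp loop(count) do
    receive do
      {:increment, caller} ->
        send(caller, {:ok, count + 1})
        loop(count + 1)

      {:get, caller} ->
        send(caller, {:ok, count})
        loop(count)
    end
  end
end</code></pre>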
<p>I have grown fond of GenServers, along with the rest of the Elixir community; however, there are some circumstances in which you might not need a GenServer.</p>
<p>Let's take a quick look at the differences between a <a href="https://hexdocs.pm/elixir/Task.html">Task</a> and a <a href="https://hexdocs.pm/elixir/GenServer.html">GenServer</a> and figure out which module fits best.</p>
<h3>Task</h3>
<p>Tasks are a simple, yet powerful tool to change the way your code is executed and introduce some light concurrency.</p>
<p>You can run a function asynchronously, isolating any failures:</p>
<pre><code class="language-elixir">Task.start(fn -&gt; raise "the roof" end)</code></pre>
<p>This line has a couple of advantages:</p>
<ul>
<li>The Task will run in a new process, leaving you free to do all of those other things you wanted to get done.</li>
<li>Any failure or error raised in the course of running the function will be isolated to the task's process, so your main process will continue executing code after the task.</li>
</ul>
<p>In contrast to <code>Task.start/1</code>, we can wait for the result, at some point in the future with <code>Task.async/1</code>:</p>
<pre><code class="language-elixir">task = Task.async(&amp;my_function/0)
# .. some other code
result = Task.await(task)</code></pre>
<p>By using a Task in this way, we can defer the retrieval of the result and execute some other code in the meantime. Once you've extracted the result, the process will exit.</p>
<p>A Task will only execute one function in its lifetime and isn't meant to be a long-running process, or be involved in any inter-process communication. The benefit of this is that it's much easier to write for one-off tasks and simpler to test. In most scenarios where you only want to run a function asynchronously, a task will suffice.</p>
<h3>Agent</h3>
<p>An <a href="https://hexdocs.pm/elixir/Agent.html">Agent</a> is a process that abstracts state. If all you need is something to hold a value for a relatively short period of time (in memory), an agent is a perfect option. An Agent is actually a GenServer that has been abstracted into its own module. So, while you get all the benefits of using a GenServer, you aren't required to set up the client-server interactions you're already familiar with.</p>
<p>If we hold a value in an Agent, we can retrieve it using <code>Agent.get/3</code> and store new values using <code>Agent.update/3</code>. These functions are already defined for you in the Agent API - functions you would have to define yourself, had you chosen to implement the same functionality with a GenServer.</p>
<pre><code class="language-elixir">{:ok, pid} = Agent.start(fn -&gt; "hello" end)
Agent.get(pid, fn state -&gt; state end)
"hello"

Agent.update(pid, fn state -&gt; state &lt;&gt; " world" end)
Agent.get(pid, fn state -&gt; state end)
"hello world"</code></pre>
<p>A timeout can be passed as the third argument to each of these functions, because they are synchronous: the caller must wait for the function passed as the second argument to finish executing. If we want to update the state asynchronously, without blocking the caller, we can use <code>Agent.cast/2</code>, which works in a similar way to <code>GenServer.cast/2</code>.</p>
<pre><code class="language-elixir">{:ok, pid} = Agent.start(fn -&gt; "hello" end)
Agent.cast(pid, fn _ -&gt; "world" end)
:ok
Agent.get(pid, fn state -&gt; state end)
"world"</code></pre>
<h3>Final thoughts</h3>
<p>As you can see, an Agent has almost all of the typical functions you'd see in a GenServer, but you don't need to worry about process-to-process communication. You can simply focus on building the functionality of your application.
Part of the rise in Elixir's popularity is due to the fact that you can build on abstractions around common problems, and only when you need fine-grained control is the underlying module there for you to use - and this is especially the case for a GenServer.</p>
<p>There are of course much more complicated problems a GenServer is quite adept at solving than anything I've written here, but the point was not to illustrate how complicated you can make an application, but rather how quickly you can get started with some simpler alternatives, and only use a GenServer when you absolutely need it.</p>]]></content:encoded>
</item>
<item>
  <title>Offset and Cursor Pagination explained</title>
  <link>https://www.jackmarchant.com/offset-cursor-pagination</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/offset-cursor-pagination</guid>
  <pubDate>Tue, 30 Oct 2018 08:22:00 +0000</pubDate>
  <description>Typically in an application with a database, you might have more records than you can fit on a page or in a single result set from a query. When you or your users want to retrieve the next page of results, two common options for paginating data include:</description>
  <content:encoded><![CDATA[<p>Typically in an application with a database, you might have more records than you can fit on a page or in a single result set from a query. When you or your users want to retrieve the next page of results, two common options for paginating data include:</p>
<ol>
<li>Offset Pagination</li>
<li>Cursor Pagination</li>
</ol>
<h3>Offset Pagination</h3>
<p>When retrieving data with offset pagination, you would typically allow clients to supply two additional parameters in their query: an <em>offset</em>, and a <em>limit</em>.
<strong>An offset is simply the number of records you wish to skip before selecting records</strong>. This gets slower as the number of records increases, because the database still has to read up to the offset number of rows to know where it should start selecting data - an <code>O(n)</code> operation, where <code>n</code> is the offset. Additionally, in datasets that change frequently, as is typical of large databases with frequent writes, the window of results will often be inaccurate across different pages: you will either miss results entirely or see duplicates, because new records have shifted the contents of the previous page.</p>
<p>If we want to get the first page of the newest posts from a database, the query might look like this:</p>
<pre><code class="language-elixir">Post
|&gt; order_by(inserted_at: :desc)
|&gt; limit(20)</code></pre>
<p>Then, when we want the second page of results, we can include an offset:</p>
<pre><code class="language-elixir">Post
|&gt; order_by(inserted_at: :desc)
|&gt; limit(20)
|&gt; offset(20)</code></pre>
<p>While you could get away with this method initially, and it's definitely worth doing first - as the number of records increases you can consider alternatives to make reading much faster and more accurate.</p>
<h3>Cursor Pagination</h3>
<p>This is where cursor based pagination comes in. <strong>A cursor is a unique identifier for a specific record</strong>, which acts as a pointer to the next record we want to start querying from to get the next page of results. By using a cursor, we remove the need to read rows that we have already seen, using a <code>WHERE</code> clause in our query (an index lookup takes the database straight to the cursor row, so reads stay fast no matter how deep we page), and we address the issue of inaccurate results by always reading after a specific row, rather than relying on the position of records to remain the same.</p>
<p>Using our previous example, but this time implementing pagination with a cursor:</p>
<pre><code class="language-elixir">Post
|&gt; order_by(inserted_at: :desc)
|&gt; limit(20)
|&gt; where([p], p.id &lt; ^cursor)</code></pre>
<p>In order for us to use a cursor, we need to return the results from the first page, in addition to the cursor for the last item in our result set. Using a cursor in this way is fine for moving forward in the result set, but by changing the fetching direction, you add complexity to how you retrieve records.</p>
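<p>One way to sketch that last step (the <code>with_cursor/1</code> helper here is hypothetical, not part of any library) is to pair each page of results with the cursor a client should send back for the next page:</p>
<pre><code class="language-elixir">defmodule Pagination do
  # Pair a page of results (newest first) with the cursor for the
  # next page - the id of the last record in this page.
  def with_cursor([]), do: %{results: [], next_cursor: nil}

  def with_cursor(results) do
    %{results: results, next_cursor: List.last(results).id}
  end
end</code></pre>
<p>The client then passes <code>next_cursor</code> back in its next request, which becomes the <code>^cursor</code> in the query above.</p>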
<h3>Conclusion</h3>
<p>Cursor pagination is most often used for real-time data, due to the frequency with which new records are added and because when reading data you often see the latest results first. There are different scenarios in which offset and cursor pagination make the most sense, so it will depend on the data itself and how often new records are added. When querying static data, the performance cost alone may not be enough to justify a cursor, as the added complexity that comes with it may be more than you need.</p>
</item>
<item>
  <title>Using Protocols to decouple implementation details</title>
  <link>https://www.jackmarchant.com/protocols</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/protocols</guid>
  <pubDate>Wed, 26 Sep 2018 09:00:00 +0000</pubDate>
  <description>Protocols are a way to implement polymorphism in Elixir: we define a function once in a protocol, and provide an implementation specific to each data type we want to support. There are two steps: defining a protocol in the form of function(s), and one or many implementations for that protocol.</description>
  <content:encoded><![CDATA[<p>Protocols are a way to implement polymorphism in Elixir: we define a function once in a protocol, and provide an implementation specific to each data type we want to support. There are two steps: defining a protocol in the form of function(s), and one or many implementations for that protocol.</p>
<p>You've probably seen this example before either in Elixir or as an Interface in other languages:</p>
<pre><code class="language-elixir">defprotocol Area do
  @doc "Calculate the area for a given object"
  def area(object)
end

defimpl Area, for: Rectangle do
  def area(rectangle) do
    rectangle.width * rectangle.length
  end
end

defimpl Area, for: Circle do
  def area(circle) do
    :math.pi() * :math.pow(circle.radius, 2)
  end
end

# These are arbitrary shape structs, but ignoring that fact, 
# we have defined a protocol and a couple of implementations. 
# Usage is then as easy as:

iex&gt; Area.area(%Rectangle{width: 5, length: 3})
15

iex&gt; Area.area(%Circle{radius: 5})
78.53981633974483</code></pre>
<h2>What is Polymorphism?</h2>
<p><strong><a href="https://en.wikipedia.org/wiki/Polymorphism_(computer_science)">Source: Wikipedia</a></strong></p>
<blockquote>
<p>Polymorphism is the provision of a single interface to entities of different types.</p>
</blockquote>
<p>I think this definition best explains what Polymorphism means for Elixir Protocols: you define a single protocol that is used as an interface to different structured data types, keeping implementation separate from your calling code.</p>
<p>The goal of Polymorphism is to define abstractions around how types are used in your application, including which operations or functions are able to be performed on them. These abstractions allow your code to be decoupled from implementation details that aren't relevant. </p>
<p>In Elixir, this means that we can define implementations of specific protocols, and then call the protocol functions on any of those object types, without knowing which object it is at run-time.</p>
<h2>Use-cases</h2>
<p>A typical situation you might find yourself in is wanting to translate an internal data structure, to an external one, perhaps for use in an API call.
Let's say we want to translate a (contrived) <code>User</code> struct into a <code>ExternalUser</code> struct, but our calling code should be generic so that it can be used to translate other types as well.</p>
<pre><code class="language-elixir"># lib/my_app/protocols/external.ex
defprotocol MyApp.External do
  @doc "Transform data from internal objects to external"
  def transform(data)
end

# lib/my_app/user/implementations/external.ex
defimpl MyApp.External, for: MyApp.User do
  @doc """
  For our mythical external API, 
  we only need ID and name of the user
  """
  def transform(user) do
    %ExternalUser{
      id: user.id,
      name: user.name,
    }
  end
end

# lib/my_app/api.ex
defmodule MyApp.API do
  @moduledoc """
  Transform data and push it to an external service.
  """

  @doc "Push transformed data with some options"
  def push(data, opts \\ []) do
    data
    |&gt; MyApp.External.transform()
    |&gt; ExternalAPI.push(opts)
  end
end</code></pre>
<p>We've now decoupled our API pushing service, <code>MyApp.API</code> is not aware of what it is pushing, only that it needs to transform the structured data first before making the request.</p>
<h3>Enumerable - you already use a protocol</h3>
<p>In case you weren't already aware, if you've been using any <code>Enum</code> functions, such as <code>Enum.map/2</code> and <code>Enum.filter/2</code>, the data types you pass as the first argument to those functions implement the <a href="https://hexdocs.pm/elixir/Enumerable.html">Enumerable</a> Protocol. This particular protocol defines four functions that need to be implemented for any type you wish to use with it <code>reduce/3</code>, <code>count/1</code>, <code>member?/2</code> and <code>slice/1</code>. You can see these functions defined in the <a href="https://github.com/elixir-lang/elixir/blob/v1.7.3/lib/elixir/lib/enum.ex#L1">Elixir code on Github</a>.</p>
<p>One of the greatest things about Elixir is you can easily browse source code to see how the standard library we use all the time is implemented internally. In theory, you can implement your own Enumerable type, but I'm not sure how useful that would be in practice!</p>]]></content:encoded>
</item>
<item>
  <title>Add Docker to Elixir/Phoenix projects in one command</title>
  <link>https://www.jackmarchant.com/exdocker</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/exdocker</guid>
  <pubDate>Thu, 23 Aug 2018 09:00:00 +0000</pubDate>
  <description>Recently, I've been writing a tonne of Elixir code, some Phoenix websites and a few other small Elixir applications. One thing that was bugging me every time I would create a new project is that I would want to add Docker to it either straight away because I knew there would be a dependency on Redis or Postgres etc, or halfway through a project and it would really slow down the speed at which I could hack something together.</description>
  <content:encoded><![CDATA[<p>Recently, I've been writing a tonne of Elixir code, some Phoenix websites and a few other small Elixir applications. One thing that was bugging me every time I would create a new project is that I would want to add Docker to it either straight away because I knew there would be a dependency on Redis or Postgres etc, or halfway through a project and it would really slow down the speed at which I could hack something together.</p>
<p>One of the things that I love about Elixir is how quickly you can get started writing an application, whether it's a web app or it's got supervision trees coming out of its ears.
In any case, I found myself going back to old projects where I had used Docker, copying over a few necessary files to get started, with the workflow that I was used to.</p>
<h2>exdocker</h2>
<p>So I decided to try to fix this problem in an Elixir CLI application (Escript). It really doesn't do much, it writes some files and does some string replacements to make setting up docker easy.
I called it <a href="https://github.com/jackmarchant/ex_docker">exdocker</a> because I'm not as creative as I used to be.</p>
<h3>Installation</h3>
<p>Make sure <code>~/.mix/escripts</code> is in your machine's <code>$PATH</code>. You can do this by adding <code>export PATH=~/.mix/escripts:$PATH</code> to your <code>.bashrc</code> or similar file.</p>
<ol>
<li><code>mix escript.install hex ex_docker</code></li>
<li><code>source ~/.bashrc</code> - load the escript and <code>$PATH</code> update</li>
</ol>
<h3>Usage</h3>
<ul>
<li>Create a new Elixir project
<code>mix phx.new my_project</code>
<code>exdocker my_project</code></li>
<li>Add to an existing Elixir project
<code>exdocker my_project</code> or <code>exdocker .</code> to run it in your current directory.</li>
</ul>
<p>Three files get created in the root of your project:</p>
<ul>
<li>docker-compose.yml - configuration of your docker containers</li>
<li>Dockerfile - Specify what needs to be installed for Elixir/Phoenix to run</li>
<li>Makefile - Convenient targets for docker-compose commands</li>
</ul>
<p>You can then run make init shell from the root to build and run Docker containers, and when this command finishes, you'll be inside a shell session with Elixir and Mix installed so you can continue development as usual.</p>
<p>Any other time you need to use it, it will be available to execute with <code>exdocker</code>.</p>
<p>Be sure to let me know if you found this useful!</p>]]></content:encoded>
</item>
<item>
  <title>Working with Tasks</title>
  <link>https://www.jackmarchant.com/working-with-tasks</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/working-with-tasks</guid>
  <pubDate>Thu, 26 Jul 2018 09:00:00 +0000</pubDate>
  <description>While writing Understanding Concurrency in Elixir I started to grasp processes more than I have before. Working with them more closely has strengthened the concepts in my own mind.</description>
  <content:encoded><![CDATA[<p>While writing <a href="/understanding-concurrency-in-elixir">Understanding Concurrency in Elixir</a> I started to grasp processes more than I have before. Working with them more closely has strengthened the concepts in my own mind.
In Elixir's standard library, there's a few modules that abstract common code that without these modules you'd find youself repeating often.
When you want to write asynchronous code, you may care about the result of the code, and sometimes you might not.
The <a href="https://hexdocs.pm/elixir/Task.html">Task</a> module makes it easy, either way.
To better understand how tasks work, I thought I would create a simple (naive) module that would implement a similar API to that of the Task.</p>
<h2>Re-implementing the Task module</h2>
<p>Consider the following module, which I'm going to call <code>Job</code>.</p>
<pre><code class="language-elixir">defmodule Job do
  def async(fun) when is_function(fun) do
    parent = self()

    spawn_link(fn -&gt;
      send(parent, {self(), fun.()})
    end)
  end

  def await(job, timeout \\ 5000) do
    receive do
      {^job, result} -&gt; result
    after
      timeout -&gt; {:error, "no result"}
    end
  end
end</code></pre>
<p><code>Job.async/1</code> accepts a single function as a parameter, and this is the work that will be carried out asynchronously. You can either run the function, without caring about the result:</p>
<pre><code class="language-elixir">iex&gt; Job.async(fn -&gt; "Hi" end)
&lt;#PID&gt;</code></pre>
<p>It returns a Process Identifier (PID), which is the result of calling <code>spawn_link/1</code>, passing in a function which in turn sends a message to the parent process. We've split up the implementation of <code>async</code> and <code>await</code> so that you can optionalally pass the PID to <code>await</code> if you care to wait for a result.</p>
<p>Let's see what that would look like:</p>
<pre><code class="language-elixir">iex&gt; Job.async(fn -&gt; "Hi" end) |&gt; Job.await()
"Hi"</code></pre>
<p>When we pattern match on the job PID to identify the message being received, and the result of the <code>job</code>, the value of the result is the result of invoking the function passed to <code>Job.async/1</code>.</p>
<p>In this case the result was seen instantly, but if it the initial function was actually performing asynchronous work, then it would wait for a timeout period to elapse before giving up. This is the <code>after</code> section of the <code>await</code> function.</p>
<pre><code class="language-elixir">iex&gt; Job.async(fn -&gt; 
  :timer.sleep(6000)
  "Hi" 
end) 
|&gt; Job.await(5000)

{:error, "no result"}</code></pre>
<p>We got an error because the timeout had elapsed, given the timer in the function paused processing until 6 seconds had gone, whereas the <code>Job.await/2</code> function gave up waiting after 5 seconds.</p>
<h2>Conclusion</h2>
<p>Hopefully the <code>Job</code> module helps your understanding of what the <code>Task</code> module is doing under the hood, to some degree, it is not the full implementation and there's a whole lot more that come with using tasks, such as process supervision, streaming, and more. That being said, it can be useful to become familiar with passing messages between processes, in any case.</p>]]></content:encoded>
</item>
<item>
  <title>Understanding concurrency in Elixir</title>
  <link>https://www.jackmarchant.com/understanding-concurrency</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/understanding-concurrency</guid>
  <pubDate>Sat, 14 Jul 2018 09:00:00 +0000</pubDate>
  <description>Concurrency in Elixir is a big selling point for the language, but what does it really mean for the code that we write in Elixir? It all comes down to Processes. Thanks to the Erlang Virtual Machine, upon which Elixir is built, we can create process threads that aren't actual processes on your machine, but in the Erlang VM. This means that in an Elixir application we can create thousands of Erlang processes without the application skipping a beat.</description>
  <content:encoded><![CDATA[<p>Concurrency in Elixir is a big selling point for the language, but what does it really mean for the code that we write in Elixir? It all comes down to <a href="https://hexdocs.pm/elixir/Process.html">Processes</a>. Thanks to the Erlang Virtual Machine, upon which Elixir is built, we can create process threads that aren't actual processes on your machine, but in the Erlang VM. This means that in an Elixir application we can create thousands of Erlang processes without the application skipping a beat.</p>
<p>One function that enables Elixir developers to create processes is <code>spawn/1</code>. <a href="https://hexdocs.pm/elixir/Kernel.html#spawn/1">Spawn</a> takes a single argument, which can either be an anonymous or named function and will create an isolated context inside a new process for the function to be run. Typically, when we invoke a function it is run in the main process thread with all of the rest of your code. There are two things to be aware of when doing this:</p>
<ol>
<li>When running application code within a single (main) process, if your code fails due to a bug or otherwise, it will stop the rest of the application from responding, and will be in a crashed state.</li>
<li>The process thread which is currently running your code, will be blocked until the execution of the function completes. This means that it's blocking other code from running, and is synchronous.</li>
</ol>
<p>Let's break down each of these points to understand their meaning.</p>
<h2>Let it crash</h2>
<p>In Elixir, a common turn of phrase is to &quot;let it crash&quot; - it being the current process - and if you're just coming to Elixir from another language, as most people are, it can be confusing to understand exactly what this means. When we follow the &quot;Let it crash&quot; principle, <code>it</code> should always be a separate process so that other parts of the application are unaffected. When we use the <a href="http://phoenixframework.org">Phoenix Framework</a>, each HTTP request is handled in a separate process, created for a single purpose. If your application needed to serve thousands of requests simultaneously, then Phoenix (and by extension <a href="https://github.com/ninenines/cowboy">Cowboy</a> - an Erlang-based HTTP server) would create thousands of requests, each in complete isolation.
Doing this means you can crash the current process, i.e. a single HTTP request and it would not affect the rest of the application. </p>
<p>Similarly, if we have an application that is not in a web context, we can create a <a href="https://elixir-lang.org/getting-started/mix-otp/supervisor-and-application.html">supervision tree</a> to handle any failures. The added benefit of using a supervision tree is that you can also determine a strategy for restarting any child processes based on the purpose of said processes. Structuring an application in this way, means that you can isolate failures, which is the purpose of letting things crash - because if they're not affecting the main process thread, then it can be handled appropriately.</p>
<h2>Asynchronous Elixir</h2>
<p>To demonstrate asynchronous elixir, it's important to understand what typically happens with your code when it is executed synchronously. Think about enumerating over a list:</p>
<pre><code class="language-elixir">Enum.each(1..10, fn n -&gt; IO.puts n end)</code></pre>
<p>When this code runs, the process in which it is running is blocked until it is finished enumerating over the list. You can see this more clearly by changing the range <code>1..10</code> to <code>1..10_000_000</code> and running it inside an <code>iex</code> shell. You'll notice that you can't do anything else in that process until it's done enumerating. This is code executing synchronously.</p>
<p>Asynchronous code can be particularly useful if you have large amounts of work that can be done concurrently. To do this in Elixir we can use the <code>spawn/1</code> function to create a new process in which to do the work. When application code executes inside of a process, it can run without blocking any code in other processes.
Similar to the previous example, we can enumerate over a list but this time we'll execute the output asynchronously:</p>
<pre><code class="language-elixir">Enum.map(1..10, fn number -&gt; 
  spawn(fn -&gt; 
    IO.puts number
  end)
end)</code></pre>
<p>You'll notice when you run this code the numbers aren't output in order like they were in the synchronous example. This is because each process is started and executes in an independent order to any others. </p>
<p>This is great when all of your code works perfectly, but in the real world, you will have to expect there to be some failures, so to replicate this real-world scenario, we can raise an exception to illustrate something not executing correctly.</p>
<pre><code class="language-elixir">Enum.each(1..10, fn number -&gt;
  spawn(fn -&gt;
    if rem(number, 2) == 0 do
      raise "the roof with number #{number}"
    else
      IO.puts(number)
    end
  end)
end)</code></pre>
<p>When this code runs it will raise an exception for all of the even numbers within the <code>1..10</code> range. We can see however, for all the odd numbers, the code executes correctly and outputs the number. In a larger context this would mean that failures are not affecting the main process where the application is running, and that any failures within any child processes are also not stopping anything in the main process, so any other code can continue to execute.</p>
<p>In a real world application, you might want to handle any cases where a process does crash, and thankfully there are a few constructs built in to Elixir that abstract away some of the necessary code to send and receive messages that you would need to handle success and failures in processes with <code>spawn/1</code>. One such construct is the <a href="https://hexdocs.pm/elixir/Task.html">Task</a> module, which is perfect for once-off asynchronous tasks, as we were doing earlier. In particular, the <code>async/1</code> and <code>await/2</code> link the calling process with the new one created in <code>Task.async/1</code>.
There are many other possibilities using Tasks, and I think they're great for getting started working with processes in Elixir.</p>]]></content:encoded>
</item>
<item>
  <title>Composing Ecto Queries</title>
  <link>https://www.jackmarchant.com/composing-ecto-queries</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/composing-ecto-queries</guid>
  <pubDate>Fri, 06 Jul 2018 09:00:00 +0000</pubDate>
  <description>Ecto is an Elixir library, which allows you to define schemas that map to database tables. It's a super light weight ORM, (Object-Relational Mapper) that allows you to define structs to represent data.</description>
  <content:encoded><![CDATA[<p><a href="https://github.com/elixir-ecto/ecto">Ecto</a> is an Elixir library, which allows you to define schemas that map to database tables. It's a super light weight <a href="https://en.wikipedia.org/wiki/Object-relational_mapping">ORM</a>, (Object-Relational Mapper) that allows you to define structs to represent data. </p>
<p>When I was first learning how to use Ecto and Elixir itself, I was amazed by the fact that you can compose queries in the same way you can compose functions. Given Elixir is a functional language in which pipelines play a big part, it's easy to see why it's such a nice way to express queries. </p>
<p>To start composing Ecto queries, you can import the Ecto.Query module.
This query will get all albums that have been released: </p>
<pre><code class="language-elixir">query = where(MyApp.Album, released: true)</code></pre>
<p>This will return an Ecto.Queryable type, which you could pass straight to a Repo (a module that handles connections to the database) using Repo.all(query), or you can add to it: </p>
<pre><code class="language-elixir"># using our previously defined query
released_with_length = where(query, [q], q.length &gt; 20)</code></pre>
<p>We are able to create a whole new query based on the existing one above. If we now pass this to Repo.all we would get all released albums longer than 20 minutes.
You may have noticed in the first query we started off using the Ecto.Schema we had defined, and in the second example we used the first query. That's because both of these structs implement the queryable protocol, essentially letting Ecto know we can use it to query for data. </p>
<h2>Queries with joins</h2>
<p>With great queries comes great responsibility, fortunately Ecto makes it easy to do joins without breaking a sweat.
Let's say we also have a songs table, and each record has an album_id to relate it to an album.
If we wanted to get a list of albums, where the songs in that album are longer than a certain number of seconds, we could do that with the following query:</p>
<pre><code class="language-elixir">@doc """
Find albums with songs longer than `length`
Includes all songs in that album, with Repo.preload/3
"""
@spec find_albums_with_songs_longer_than(integer()) :: list(Album.t())
def find_albums_with_songs_longer_than(length) do
  Album
  |&gt; compose_albums_with_song_length(length)
  |&gt; distinct([a], a.id)
  |&gt; find_all()
  |&gt; Repo.preload([:songs])
end

defp compose_albums_with_song_length(queryable, length) do
  queryable
  |&gt; join(:inner, [album], song in Song, album.id == song.album_id)
  |&gt; where([_, s], s.length &gt; ^length)
end</code></pre>
<p>There's a few things going on here, but the main part is using a function to join on the songs table and scope the query for albums to return only the ones with songs where they are longer than a certain integer.
This pattern is useful for abstracting lower levels of a query into smaller parts, so you can join them up in a function that has a bit more context. Typically, you might have done this before with functions, but each function call would itself have gone to the database and you'd use an enumerable to filter results.</p>
<p>This type of composition is made possible through Ecto query bindings. These are the references to schemas that have been added to a query, in a list ordered in the same way in which they were added.
The order matters in query bindings, which can make it difficult to do multiple joins across different functions in the same way we split our query out into functions before.</p>
<h2>Sample application - try it out for yourself</h2>
<p>I built a small application to show how this all works together in an application. I would encourage you to clone it and check it out. There's not a lot of resources out there to get started working in Elixir but this application might show you how to get something simple working, while also showing some deeper examples of how powerful composition is in Ecto and Elixir in general.</p>
<p>It has tests as well, so that you can make changes to the queries and run <code>mix test</code>, to see if you broke anything.</p>
<p>So here it is: <a href="https://github.com/jackmarchant/composing-ecto-queries">Composing Ecto Queries on Github</a></p>]]></content:encoded>
</item>
<item>
  <title>Streaming large datasets in Elixir</title>
  <link>https://www.jackmarchant.com/streaming-datasets</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/streaming-datasets</guid>
  <pubDate>Wed, 27 Jun 2018 21:34:00 +0000</pubDate>
  <description>We often think about Streaming as being the way we watch multimedia content such as video/audio. We press play and the content is bufferred and starts sending data over the wire. The client receiving the data will handle those packets and show the content, while at the same time requesting more data. Streaming has allowed us to consume large media content types such as tv shows or movies over the internet.</description>
  <content:encoded><![CDATA[<p>We often think about Streaming as being the way we watch multimedia content such as video/audio. We press play and the content is bufferred and starts sending data over the wire. The client receiving the data will handle those packets and show the content, while at the same time requesting more data. Streaming has allowed us to consume large media content types such as tv shows or movies over the internet.</p>
<p>A <a href="https://hexdocs.pm/elixir/Stream.html">Stream</a> in the Elixir sense of the word, is a composable way to lazily evaluate transformations on collections. When managing large datasets, traditionally you would load all the records into memory, say from a database query, and use the <a href="https://hexdocs.pm/elixir/Enum.html">Enum</a> module to apply various transformations with each call to an <code>Enum</code> function. With Streams, you can call Stream functions in a composable way, but only when <code>Stream.run/1</code> is called or it's converted to an enumerable does it actual perform those computations.</p>
<p>When we create a new Stream by calling one of the many functions on the Stream module, for example <code>Stream.map/2</code>, we pass it an enumerable and a function, which we want to be applied lazily. We can see the Stream that is returned, keeps a reference to the original enumerable: <code>enum</code> and the function(s): <code>funs</code> we want applied. </p>
<p>It's only when we convert the Stream to an Enumerable, that the functions run against the enumerable, and we have our result.</p>
<pre><code class="language-elixir"># Create a new stream
&gt; [1, 2, 3, 4, 5] |&gt; Stream.map(&amp;(&amp;1 * 2))
#Stream&lt;[
  enum: [1, 2, 3, 4, 5],
  funs: [#Function&lt;48.71542911/1 in Stream.map/2&gt;]
]&gt;

# do some more code things

# Ok, let's evaluate the stream by converting it to an enumerable
&gt; Enum.to_list(stream)
[2, 4, 6, 8, 10]</code></pre>
<h2>Why should you use a Stream?</h2>
<p>There are three advantages I can see with using Streams:</p>
<ol>
<li>Functions can be lazily evaluated and thus built up over time, until the stream is finally converted or run.</li>
<li>Large datasets can be split into smaller chunks, reducing the amount of memory needed to consume them.</li>
<li>Streams encourage function composability without needing to write complex code in an <code>Enum.reduce</code>.</li>
</ol>
<p>These advantages are a bit easier to describe in code:</p>
<pre><code class="language-elixir">defmodule StreamOrNotToStream do
  @doc "Without Streams, we enumerate over the range 3 times, every time we call Enum.map/2"
  def without_stream(enumerable) do
    enumerable
    |&gt; Enum.map(&amp;(&amp;1 * 2))
    |&gt; Enum.map(&amp;(&amp;1 + 1))
    |&gt; Enum.map(&amp;(&amp;1 - 1))
  end

  @doc "With Streams, we build up all of the transformations and enumerate only once!"
  def with_stream(enumerable) do
    enumerable
    |&gt; Stream.map(&amp;(&amp;1 * 2))
    |&gt; Stream.map(&amp;(&amp;1 + 1))
    |&gt; Stream.map(&amp;(&amp;1 - 1))
    |&gt; Enum.to_list()
  end
end

&gt; StreamOrNotToStream.without_stream(1..100) # fun fact - Ranges are also Streams!
[2, 4, 6, ...]

&gt; StreamOrNotToStream.with_stream(1..100) # same result - but we got there differently
[2, 4, 6, ...]</code></pre>
<p>You can see how having large datasets, and enumerating over the entire list for each transformation would be more expensive. With Streams and its lazy evaluation, we can defer getting the value until it's needed, which means it can reduce the time spent doing potentially expensive calculations.</p>
<p>Streams are a powerful concept that allows you to efficiently manage even infinite datasets through encouraging composition of functions. A Stream is a handy substitute for what might otherwise be a complex <code>Enum.reduce/3</code> function. Using a Stream not only cleans up your code, but will give you a clearer mental picture of the transformations happening on the data.
Composing functions is what Elixir is good at, Streams allow you to still break up the data transformations, and perhaps even do them at separate times - this wouldn't be easy in a reducer function.</p>
<h2>Working with Ecto (or other data sources)</h2>
<p>Streams can be really powerful when using them with a database, specifically with either <a href="https://hexdocs.pm/ecto/Ecto.Repo.html#c:stream/2">Repo.stream/2</a> or <a href="https://hexdocs.pm/elixir/Stream.html#resource/3">Stream.resource/3</a>. The latter is a bit more generic so we'll use that as our example.</p>
<p>With <code>Stream.resouce/3</code> you can chunk your dataset into specified amounts, and emit them through a stream. It allows you to keep track of the last record that was seen through an identifier, and pick up where it left off for the next chunk. All you need to do think about is what transformations to apply and when to evaluate them.</p>
<p>We could apply these concepts for other data sources, not just Ecto or even a database. This could be used to receive results from an API that uses pagination to move through the data.</p>
<p>Using Streams, we can compose functions and push them on to a stack until such a time that we're ready for Elixir to evaluate the result of all of the functions together. This is a powerful concept and I'm looking forward to doing more with them in the future.</p>]]></content:encoded>
</item>
<item>
  <title>A Queue is just a Q with 4 silent letters</title>
  <link>https://www.jackmarchant.com/elixir-queues</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/elixir-queues</guid>
  <pubDate>Wed, 06 Jun 2018 09:20:00 +0000</pubDate>
  <description>A Queue is a collection data structure, which uses the FIFO (First In, First Out) method. This means that when you add items to a queue, often called enqueuing, the item takes its place at the end of the queue. When you dequeue an item, we remove the item from the front of the queue.</description>
  <content:encoded><![CDATA[<p>A Queue is a collection data structure, which uses the FIFO (First In, First Out) method. This means that when you add items to a queue, often called enqueuing, the item takes its place at the end of the queue. When you dequeue an item, we remove the item from the front of the queue. </p>
<p>Both of these methods will change the length of the queue. A peek method can be implemented to look at what the first item is in the queue, without removing the item, leaving the queue unchanged.</p>
<p>Queues are often implemented with list data structures, such as a Linked List. In Elixir, lists are singly-linked lists under the hood. We’re able to access the head and tail of a list, which refer to the first item in the list and the rest of the list, respectively.</p>
<pre><code class="language-elixir">[head | tail] = [1, 2, 3]
&gt; head
1
&gt; tail
[2, 3]</code></pre>
<p>If you think about implementing a queue in Elixir, we would need to implement the following methods:</p>
<ul>
<li>Enqueue</li>
<li>Dequeue</li>
<li>Peek</li>
<li>Count</li>
</ul>
<p>Let’s look at each of these individually.</p>
<h2>Enqueue</h2>
<p>When adding an item to a list in Elixir, it’s common to prepend to the list, then reverse it when accessing it to preserve order. As lists are singly-linked in Elixir, it is much faster to add to the front of the list, rather than adding to the end and having to re-create all of the links in the list.</p>
<p>For this reason, you could implement an enqueue function in this way:</p>
<pre><code class="language-elixir">@spec enqueue(list(), any()) :: list()
def enqueue(queue, item) do
  [item | queue]
end</code></pre>
<h2>Dequeue</h2>
<p>Now that we have items in a list, we need a way to remove one when we call dequeue. I started having a look at how Elixir deletes items from a list and found <a href="https://github.com/elixir-lang/elixir/blob/v1.6.5/lib/elixir/lib/list.ex#L123">List.delete/2</a> — we can see a few function heads there, but here are the two lines you need to appreciate:</p>
<pre><code class="language-elixir">def delete([item | list], item), do: list  
def delete([other | list], item), do: [other | delete(list, item)]</code></pre>
<p>The first argument is the list, and the second argument is the item to be removed. Elixir binds the second argument name as the same value as the head of the list, and if this function is called, it returns the tail (thus removing the item from the list). Otherwise, if the two item variables are not a match, the head is prepended and <code>delete/2</code> is recursively called on the tail.</p>
<p>That might be a bit to take in, but I recommend trying it out in an interactive Elixir shell <code>iex</code>.</p>
<p>Although we don't need <code>List.delete/2</code> in this case, we can implement a dequeue function like so:</p>
<pre><code class="language-elixir">@spec dequeue(list()) :: {any(), list()}
def dequeue([]), do: nil
def dequeue(queue) when length(queue) &lt;= 2 do
 [item | tail] = Enum.reverse(queue)
 {item, tail}
end
def dequeue(queue) do 
 {Enum.at(queue, -1), Enum.drop(queue, -1)}
end</code></pre>
<p>We return a tuple in this function, because we want to know both the item that was dequeued, and the remaining items in the queue (so we can enqueue more items later). It’s worth noting as well that this is not the most efficient solution, as it is using an <code>O(n)</code> algorithm because the <code>Enum</code> methods we’re using are always going to enumerate of the list to get the last item.</p>
<h2>Peek</h2>
<p>A peek function is simply a utility to allow looking at the front of the queue, without changing the queue itself. Although, you might want to add some extra function heads to cater for empty lists.</p>
<pre><code class="language-elixir">@spec peek(list()) :: any() | nil
def peek([]), do: nil
def peek(queue) do
  [h | _ ] = queue
  h
end</code></pre>
<h2>Count</h2>
<p>Similarly, count is the number of items still in the queue, and can be implemented as such:</p>
<pre><code class="language-elixir">@spec count(list()) :: integer()
def count([]), do: 0
def count(queue), do: length(queue)</code></pre>
<p>These functions are all fine in theory, but when we start to think about implementing a queue in Elixir, we can’t wrap this up in a class that knows about it’s own state. Instead, we could implement these functions as part of a GenServer, which will hold it’s own state and can be updated over time.</p>
<h2>Priority Queue</h2>
<p>When simple FIFO doesn’t cut it and you need to be able to process items in a queue before others we can implement a Priority Queue. This means that when an item is enqueued, it doesn’t necessarily go to the back of the queue (or front of the list in Elixir), each new item needs to be compared with other items until we find a suitable place for it based on its priority.</p>
<p>Priority could mean integer values, for example the number 10 would have a higher priority than 5, because it is the higher value. Imagine the following queue:</p>
<p><code>head -&gt; 7 - 3 - 1 &lt;- tail</code></p>
<p>If we base priority on the higher integer values, and we add 10 to this queue, we would expect it to take priority over all other values because it is highest. So we’re left with the following queue:</p>
<p><code>head -&gt; 10 - 7 - 3 - 1 &lt;- tail</code></p>
<p>If the priority of the new item was not higher than any of the existing items, it would simply be added to the end of the queue.</p>
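<p>The insertion just described can be sketched as a recursive function (my own illustration; here the list head is the front of the queue, matching the diagrams, and higher numbers win):</p>
<pre><code class="language-elixir">defmodule PriorityEnqueue do
  # An empty queue: the new item is the whole queue.
  def enqueue([], item), do: [item]
  # The new item outranks the current head, so it goes in front.
  def enqueue([head | tail], item) when item &gt; head, do: [item, head | tail]
  # Otherwise keep walking towards the tail.
  def enqueue([head | tail], item), do: [head | enqueue(tail, item)]
end</code></pre>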
<p>Queues have a variety of real-world applications, such as scheduling asynchronous work or handling large numbers of requests. High-priority requests can be processed first, and lower-priority ones later.</p>
<p>I decided to implement a simple <a href="https://hex.pm/packages/pex_queue">Queue library</a> using a <code>GenServer</code>, with optional priority. You can take a look at the <a href="https://hex.pm/packages/pex_queue">documentation</a> or go straight to <a href="https://github.com/jackmarchant/pex_queue/blob/master/lib/pexqueue.ex">the code</a>.</p>]]></content:encoded>
</item>
<item>
  <title>Composing Elixir Plugs in a Phoenix application</title>
  <link>https://www.jackmarchant.com/composing-plugs</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/composing-plugs</guid>
  <pubDate>Fri, 23 Mar 2018 09:00:00 +0000</pubDate>
  <description>Elixir is a functional language, so it’s no surprise that one of the main building blocks of the request-response cycle is the humble Plug. A Plug will take a connection struct (see Plug.Conn) and return a new struct of the same type. It is this concept that allows you to join multiple plugs together, each applying its own transformation to a Conn struct.</description>
  <content:encoded><![CDATA[<p>Elixir is a functional language, so it’s no surprise that one of the main building blocks of the request-response cycle is the humble Plug. A Plug will take a connection struct (see <a href="https://hexdocs.pm/plug/Plug.Conn.html">Plug.Conn</a>) and return a new struct of the same type. It is this concept that allows you to join multiple plugs together, each applying its own transformation to a Conn struct.
A plug can be either a function or a module that implements the Plug behaviour.</p>
<h2>What is a Plug?</h2>
<p>Before we get to know Plug, add it as a dependency to your project’s mix.exs file: <code>{:plug, "~&gt; 1.5"}</code> and run <code>mix deps.get</code> to install it.
A Plug has the following structure:</p>
<pre><code class="language-elixir">defmodule MyApp.Plug.AuthenticateUserSession do
  import Plug.Conn

  def init(options), do: options

  def call(conn, _opts) do
    put_session(conn, :user, %{name: "Jack"})
  end
end</code></pre>
<p>In this example, we’re adding a :user map to the current session data. The two main functions are <code>init/1</code> and <code>call/2</code> both of which are part of the Plug behaviour, defined in <code>plug.ex</code>.
<code>init/1</code> is used to initialise the plug with options that can be used in the <code>call/2</code> function. <code>call/2</code> is the meat of the plug, where you take a <code>%Plug.Conn{}</code> struct and return a new one.</p>
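<p>A module isn’t always necessary: as noted earlier, a plug can also be a plain two-arity function. A sketch (the module and header name are my own examples):</p>
<pre><code class="language-elixir">defmodule MyApp.Plug.Helpers do
  import Plug.Conn

  # A function plug takes a conn and options and returns a conn.
  # When defined in your router module it can be referenced as
  # `plug :put_app_header`.
  def put_app_header(conn, _opts) do
    put_resp_header(conn, "x-app", "my-app")
  end
end</code></pre>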
<h2>How to use a Plug</h2>
<p>You can use a Plug in your application’s router. It might look something like this:</p>
<pre><code class="language-elixir">defmodule MyApp.Router do
  use Plug.Router

  alias MyApp.Plug.AuthenticateUserSession

  plug AuthenticateUserSession
  plug :match
  plug :dispatch

  get("/", do: send_resp(conn, 200, "Welcome\n"))
end</code></pre>
<p>Now, when you visit <code>/</code> in your application, it will run <code>AuthenticateUserSession.call/2</code> and as we defined earlier, it will add a user map into the session data.</p>
<h2>How to combine multiple Plugs</h2>
<p>When building an application, you might need to do more than add a user to a session. Rather than extending the AuthenticateUserSession plug, you can create a new plug that does the specific job you need. This lets us compose discrete plugs and makes sure we’re following the <a href="https://en.wikipedia.org/wiki/Single_responsibility_principle">Single Responsibility Principle</a>.</p>
<p>With Elixir plugs, we can combine multiple plugs in a <code>:pipeline</code> that can group functionality together.</p>
<pre><code class="language-elixir">defmodule MyApp.Router do
  use MyApp.Web, :router

  alias MyApp.Plug.{AuthenticateUserSession, AuthoriseUserSession}

  pipeline :auth do
    plug AuthenticateUserSession
    plug AuthoriseUserSession
  end

  scope "/" do
    pipe_through :auth
    get "/page", MyApp.PageController, :index
  end
end</code></pre>
<p>Order is important when using a pipeline, as plugs will be called in the order they are defined. Now, in MyApp, when <code>/page</code> is requested, <code>AuthenticateUserSession.call/2</code> will be called, which can either return a <code>%Plug.Conn{}</code> and let the request continue to the next plug, or halt the request and return a different response. In the former case, the <code>AuthoriseUserSession.call/2</code> function will run. Finally, if this plug returns successfully, <code>MyApp.PageController</code> will be called and the user receives a response.</p>
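<p>Halting is worth seeing in code. As a hedged sketch (the module name and session key are my own, in the spirit of the examples above), a plug that short-circuits the pipeline might look like:</p>
<pre><code class="language-elixir">defmodule MyApp.Plug.RequireUser do
  import Plug.Conn

  def init(opts), do: opts

  # With no user in the session, respond with 401 and halt so that no
  # later plug or controller runs; otherwise pass the conn through.
  def call(conn, _opts) do
    case get_session(conn, :user) do
      nil -&gt;
        conn
        |&gt; send_resp(401, "Unauthorized")
        |&gt; halt()

      _user -&gt;
        conn
    end
  end
end</code></pre>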
<p>The Plug concept has many applications beyond just the request-response cycle. It stands as a metaphor to explain how Elixir applications can be built up with many parts, all performing their job and handing off to the next module.
You can read more about plugs on the <a href="https://hexdocs.pm/plug">docs of the Plug package</a>.</p>]]></content:encoded>
</item>
<item>
  <title>A Comparison of Elixir Supervision Trees and React Component Trees</title>
  <link>https://www.jackmarchant.com/elixir-supervision-trees</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/elixir-supervision-trees</guid>
  <pubDate>Tue, 06 Feb 2018 09:00:00 +0000</pubDate>
  <description>A Supervision Tree in Elixir has quite a number of parallels to how developers using React think about a component tree. In this article I will attempt to describe parallel concepts between the two - and if you've used React and are interested in functional programming, it might prompt you to take a look at Elixir.</description>
  <content:encoded><![CDATA[<p>A Supervision Tree in <a href="https://elixir-lang.org/">Elixir</a> has quite a number of parallels to how developers using React think about a component tree. In this article I will attempt to describe parallel concepts between the two - and if you've used React and are interested in functional programming, it might prompt you to take a look at Elixir.</p>
<p>Before we get started you'll need to know that <a href="https://elixir-lang.org/getting-started/mix-otp/supervisor-and-application.html#our-first-supervisor">Supervision Trees</a> are not necessarily a concept that was born out of the development of the Elixir language, but form part of a concept known as <a href="https://learnyousomeerlang.com/what-is-otp">OTP</a> (Open Telecom Platform), coined by the creators of the <a href="https://www.erlang.org/">Erlang</a> language.</p>
<p>Hopefully I haven't lost you yet...take a look at this <a href="https://americanheritagetrees.org/wp-content/uploads/2016/10/Forest.png">picture of an actual tree to refresh</a>, and then come back.</p>
<h3>Isolating Failure</h3>
<p>One of the main building blocks in OTP is isolating processes so that they act (and fail) independently. When a new process is spawned in Elixir, it is common to monitor it with a <a href="https://hexdocs.pm/elixir/Supervisor.html">Supervisor</a>, so that if an error happens, the reason can be logged or sent to an error reporting service. The parallel in the React component tree: when a parent component renders one of its children and that child throws, the parent can catch the error with <code>componentDidCatch</code> and similarly log or send an error report.</p>
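<p>To make the Elixir side concrete, here is a minimal sketch (the module names are my own) of a supervisor that restarts a crashed worker in isolation:</p>
<pre><code class="language-elixir">defmodule MyApp.Worker do
  use Agent

  def start_link(_opts), do: Agent.start_link(fn -&gt; 0 end, name: __MODULE__)
end

defmodule MyApp.Supervisor do
  use Supervisor

  def start_link(opts \\ []), do: Supervisor.start_link(__MODULE__, :ok, opts)

  @impl true
  def init(:ok) do
    # :one_for_one restarts only the child that crashed,
    # leaving its siblings untouched.
    Supervisor.init([MyApp.Worker], strategy: :one_for_one)
  end
end</code></pre>
<p>If <code>MyApp.Worker</code> is killed, the supervisor notices the exit and starts a fresh process in its place, while the rest of the tree carries on.</p>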
<h3>Message/Data Flow</h3>
<p>In React Component Trees, the flow of data is one-way, from parent to child(ren). The parent component can also pass functions as props, which would enable the child component to respond back to the parent. The parent can then handle this callback by setting a new state, and consequently, it may re-render its children.
In an Elixir Supervision Tree, a child process can be linked to the parent process, allowing the parent to be sent a message when something happens, for example, when the process finishes what it was doing. A common scenario might be that a process could spawn a <a href="https://hexdocs.pm/elixir/Task.html">Task</a>, which on completion could (depending on how it is spawned) send a message back to the parent process for it to be handled appropriately.</p>
<h3>Guarantees with a Tree structure</h3>
<p>A tree structure makes sense when we think about UI, so that we can predictably control the way data flows through an application, allowing us to make certain guarantees about our components. You might have heard this described as React being &quot;easy to reason about&quot;.</p>
<p>Elixir Supervision Trees also utilise the tree structure to make guarantees around availability and isolation - key concepts in OTP. A supervision tree isolates each node and set of nodes so that it can both easily recover when things go wrong (restarting processes - isolation of failure) and keep the rest of the nodes in the tree unaffected by the failure. You can think about this like branches in an actual tree - when a branch on a tree dies, it can be cut off and the rest of the tree will attempt to regrow the branch.</p>
<p>Similarly, in a React Component Tree, as I mentioned earlier, errors can be caught with the <code>componentDidCatch</code> lifecycle method - and one day a <a href="https://reactjs.org/blog/2019/02/06/react-v16.8.0.html#whats-next">hook</a> - at various points in the tree to stop an error from crashing the whole page and making it unusable. Instead, only one branch or set of components in the tree won't render correctly, or shows an error state, while the rest of the application keeps working as if nothing happened.</p>
<p>If you still have no idea why you would use a Supervision Tree in Elixir or how it could possibly relate to a UI library - I'm sorry, that's all I've got.</p>]]></content:encoded>
</item>
<item>
  <title>Surviving technical debt in the real world</title>
  <link>https://www.jackmarchant.com/surviving-tech-debt</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/surviving-tech-debt</guid>
  <pubDate>Thu, 21 Dec 2017 12:18:00 +0000</pubDate>
  <description>Technical debt is a potentially crippling disease that can take over your codebase without much warning. One day, you’re building features, the next, you struggle to untangle the mess you (or maybe your team) has created.</description>
  <content:encoded><![CDATA[<p>Technical debt is a potentially crippling disease that can take over your codebase without much warning. One day, you’re building features, the next, you struggle to untangle the mess you (or maybe your team) has created.</p>
<p>The good news is you’re not alone. Engineers everywhere accrue technical debt, the trouble is when you never do anything to get yourself out of debt, you just keep smashing features out into the dark abyss of GitHub.</p>
<p>You can train yourself not to take a shortcut every time, and start to learn the benefits of technical debt when used properly. All it takes is a bit of willpower. Yes, the same thing you use if you want to shed a few kilos. When you take shortcuts all the time, it’s hard to bust out of that groove and future-proof things.</p>
<h2>Does it even work?</h2>
<p>Always be testing your code. Even when you write code that you can see is bad, write a test so that when you refactor it you can have a level of confidence it’s still going to work.</p>
<h2>You’re not in control of all situations</h2>
<p>There might come a time when you’re pressed to get things done now. When that happens, get it done and take a cheat day. There’s nothing wrong with this; after all, you are not your code, and businesses want things done yesterday.</p>
<p>Just try to record that you took the shortcut, perhaps with a FIXME comment, and plan to refactor.</p>
<h2>Bugs are bad, mmkay</h2>
<p>Be okay with pushing bugs to production. Bugs are not created equal, so of course don’t be okay with breaking the application, but small bugs are going to happen. Missing one is not a measure of your expertise as a developer. I would rather measure a person’s commitment to fixing said bug than hold producing it against them.</p>
<h2>Catch it in code review</h2>
<p>Code review is an effective exercise for both reviewer and author. If something is a short cut, make sure the author knows. On the flip side, as an author, make sure you let reviewers know the code could be improved but you will fix it later. Ownership of tech debt is important, because in a team it can often get lost as nobody’s problem until you all have to deal with it.</p>
<h2>Don’t overuse technical debt as an excuse to write bad code</h2>
<p>Use technical debt sparingly. If you’re spending more time justifying why the code you’re writing is technical debt than actually writing code, I have bad news for you.</p>
<p>It ain’t technical debt.</p>
<p>It takes a lot of self control and willpower not to take the easy road, especially when you discover someone else’s technical debt, which they have not bothered to refactor.</p>
<p>It’s a skill, which once mastered, means your coworkers will thank you.</p>
<p>I am by no means an expert at not writing or fixing technical debt, but it has become a satisfying effort for me to fix what was once broken or difficult to understand.</p>
<p>If you’ve read this far, you probably have enough willpower to fix tech debt already…</p>]]></content:encoded>
</item>
<item>
  <title>Elixir Pattern Matching in a nutshell</title>
  <link>https://www.jackmarchant.com/pattern-matching-elixir</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/pattern-matching-elixir</guid>
  <pubDate>Tue, 15 Aug 2017 20:00:00 +0000</pubDate>
  <description>Before being introduced to Elixir, a functional programming language built on top of Erlang, I had no idea what pattern matching was. Hopefully, by the end of this article you will have at least a rudimentary understanding of how awesome it is.</description>
  <content:encoded><![CDATA[<p>Before being introduced to <a href="http://elixir-lang.org">Elixir</a>, a functional programming language built on top of <a href="https://www.erlang.org/">Erlang</a>, I had no idea what pattern matching was. Hopefully, by the end of this article you will have at least a rudimentary understanding of how awesome it is.</p>
<p>In most programming languages, you will assign a value to a variable using something like:</p>
<pre><code class="language-js">const myVariable = 'my value';
console.log(myVariable); // 'my value'</code></pre>
<p>Now, <code>myVariable</code> is bound to the value you assigned to it and you can continue living your life.</p>
<p>When you need to check the value of a variable, in most other languages you would use conditional “if statements”, which can get unreadable as soon as you add more than 2 or 3. This is because it’s difficult to see the flow of logic, especially if the function spans many lines.</p>
<p>Technically you can do the same thing in Elixir, but how the compiler interprets it is significantly different. The <code>=</code> sign is actually called the 'match' operator: it takes the pattern on the left and matches it against the value on the right, binding any variables in the pattern when the match succeeds.</p>
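<p>A quick sketch you can run in <code>iex</code>:</p>
<pre><code class="language-elixir">x = 1   # binds x, because x was unbound
1 = x   # also valid Elixir: the literal 1 matches the value of x
# 2 = x would raise a MatchError, because 2 does not match 1</code></pre>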
<p>Tuples are used frequently in Elixir code to enable returning multiple values from a function. Typically, you would come across a {status, value} tuple, for example:</p>
<pre><code class="language-elixir">{:ok, return_value} = do_stuff()</code></pre>
<p><code>do_stuff()</code> must return a tuple which matches that structure (otherwise Elixir will raise a ‘MatchError’), and return_value is now bound to the second item in the tuple returned from this function.</p>
<p>This is basically how pattern matching works, but the real beauty is how you use it in various contexts, for example:</p>
<p>When a function can return multiple values, such as the {status, value} tuple we came across earlier:</p>
<pre><code class="language-elixir">case do_stuff() do
 {:ok, value} -&gt; value
 {:error, _} -&gt; raise "Oh no!"
end</code></pre>
<p>In function heads you can pattern match on parameters, to only run when particular requirements are met:</p>
<pre><code class="language-elixir">def my_func({:ok, value}), do: value
def my_func({:error, _}), do: raise "Oops!"
IO.puts my_func({:ok, "hello"}) # "hello"</code></pre>
<p>You can even match on lists:</p>
<pre><code class="language-elixir">[first, second, third] = [1, 2, 3]</code></pre>
<p>And decompose data structures</p>
<pre><code class="language-elixir">%{value: value} = map_func()</code></pre>
<p>There are so many examples of pattern matching in Elixir because it’s incredibly useful and powerful, often replacing whole chains of conditionals with a few declarative clauses.</p>
<p>In a nutshell, that’s pattern matching!</p>]]></content:encoded>
</item>
<item>
  <title>First Impressions of Elixir</title>
  <link>https://www.jackmarchant.com/first-impressions-elixir</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/first-impressions-elixir</guid>
  <pubDate>Fri, 06 Jan 2017 09:00:00 +0000</pubDate>
  <description>Elixir is a functional programming language based on Erlang. I’m told it’s very similar to Ruby, with a few tweaks and improvements to the developer experience and language syntax.</description>
  <content:encoded><![CDATA[<p>Elixir is a functional programming language based on Erlang. I’m told it’s very similar to Ruby, with a few tweaks and improvements to the developer experience and language syntax.</p>
<p>[Detour – buckle your seatbelts]</p>
<p>I’m drawn to Elixir because of my interest in Functional Programming, generally and specifically in JavaScript. I started by learning techniques for writing pure functions so that I could more easily test my code. Then I progressed into learning about composition, currying and partial application in JavaScript, particularly as it was useful to know when using Redux.</p>
<p>[take a breather, that’s a lot of buzz]</p>
<p>…</p>
<p>[enough breathing]</p>
<p>So, then (without mastering any of the above) I decided to try my luck with learning functional programming theory (which is basically just math). That was fun too.</p>
<p>I have twisted and turned through it all, trying to figure out what FP means, and to be honest I’m enjoying it – so why not try to pick up a language that is purely (get it?) functional.</p>
<p>[back to Elixir…/Detour]</p>
<p>Now you know why I’m interested in learning a functional programming language such as Elixir, let’s go on a mythical journey together to see what I’ve learned.</p>
<h2>Learning new things is hard</h2>
<p>One of the best things I’ve noticed about Elixir – given at this point in time I’ve used it for about a week – is that it’s incredibly focussed on developer experience.</p>
<p>I think the creator of Elixir (José Valim) must have looked at Erlang and thought we could do better than this. The best people take great ideas and make them easier for other people to learn.</p>
<p>Here’s a few things Elixir does or helps you do as a developer (in list format because who doesn’t like a good list?)</p>
<ul>
<li>Built in unit testing (run <em>‘mix test’</em>)</li>
<li>Encourages documentation through making it part of the module distribution</li>
<li>Interactive Elixir (<code>iex</code>) for running code in a terminal</li>
<li>More approachable syntax than Erlang</li>
<li>Pattern Matching (what’s that? – keep reading)</li>
</ul>
<p>That’s plenty of things. What more do you want?</p>
<p>All of this makes it easier for a willing developer to pick up Elixir and give it a go. Nobody wants to be fiddling around with their development environment in a shameful attempt to start learning a new language.</p>
<h2>FP is FP</h2>
<p>Functional programming is fucking powerful. It shouldn’t be underestimated. I don’t want to spend any more than 3 seconds walking through my code (and dissecting my brain that day) because I wasn’t bothered to create a function with a clear interface and signature. I am starting to think a function with more than, say, 10 lines is too long and doing too much.</p>
<p>It’s easy to say Elixir is better because it enforces strict rules about what and how you should write your code but I think it would still be possible to write shitty code in any language, just easier in others.</p>
<p>Given that Elixir is an FP language it makes sense that all of the Elixir modules follow its general principles. Taking my early experiences with Elixir into account, I can say I appreciate the strict-ness of the language coming from JavaScript, but there’s also something to be said about the creativity and expressiveness you can have writing functional JavaScript – and there’s plenty of people talking about that now.</p>
<p>It’s interesting that I haven’t seen more people checking out Functional languages after discovering the power of it in JavaScript. Maybe they just haven’t written a kickass article about it?</p>
<h2>WTF? (What the feature?!)</h2>
<p>In keeping with Elixir’s functional ties, there’s a feature called Pattern Matching, which I’m very excited to learn more about. I don’t think this is an Elixir-only feature but it’s certainly the first time I have come across it.</p>
<p>The idea (from what I can gather) is that you can define multiple versions of a function, with values in place of parameters, and when the value passed in matches, that version runs instead of another further down.</p>
<p>As an example, I had a recursive function that takes a list, but I only want it to run when there are items in the list (otherwise it would get stuck in a recursive loop).</p>
<p>My instinct would have been to use an if statement to check whether there are items and return early – but with Pattern Matching you can say when the first parameter is an empty list, just return early. You have to make sure the pattern matching function is defined before any function you want to override.</p>
<p>This concept separates the two cases into two functions, rather than having one function that handles all cases. As a beginner that is a difficult thing to realise but I’m interested to see whether it improves code readability.</p>
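<p>The recursive list case described above looks something like this (a toy example of my own, not the actual code from that project):</p>
<pre><code class="language-elixir">defmodule ListSum do
  # The empty-list clause comes first, so the recursion terminates here
  # instead of looping forever.
  def sum([]), do: 0
  # Otherwise add the head to the sum of the tail.
  def sum([head | tail]), do: head + sum(tail)
end</code></pre>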
<h2>Where do we go ? (Where do we go now)</h2>
<p>Functional programming has been around for ages, but as software engineering on the web matures, developers begin to question how we’ve done things and look for something better.</p>
<p>Elixir sounds like a new challenge to me and has some good things going for it. Now seems like the perfect time for me to pick it up so my goal will be to become more comfortable with it and be able to start a project from scratch and build something myself without a tutorial helping me along.</p>
<p><strong>TLDR</strong></p>
<ul>
<li>Elixir looks like a fun way to learn more functional programming concepts</li>
<li>Elixir’s focus on documentation, tests and readable code is what motivates me to learn more about it – and from other reading it seems highly scalable.</li>
<li>The developer experience seems to have been thought out – making it appealing and easy to get started.</li>
<li>The package management system, including package distribution seems similar to NPM.</li>
</ul>
<h2>So far</h2>
<p>If you’ve read this far, I congratulate you on your job well done. Take the rest of the day off.</p>
<p>Here are some links to Elixir work I’ve done so far:</p>
<ul>
<li><a href="https://github.com/jackmarchant/misc">jackmarchant/misc</a></li>
<li><a href="https://github.com/jackmarchant/todo-elixir">jackmarchant/todo-elixir</a></li>
<li><a href="https://hex.pm/packages/misc">misc</a></li>
</ul>]]></content:encoded>
</item>
<item>
  <title>No excuses, write unit tests</title>
  <link>https://www.jackmarchant.com/write-unit-tests</link>
  <guid isPermaLink="true">https://www.jackmarchant.com/write-unit-tests</guid>
  <pubDate>Tue, 29 Nov 2016 12:22:00 +0000</pubDate>
  <description>Unit testing can sometimes be a tricky subject no matter what language you’re writing in. There’s a few reasons for this:</description>
  <content:encoded><![CDATA[<p>Unit testing can sometimes be a tricky subject no matter what language you’re writing in. There’s a few reasons for this:</p>
<ul>
<li>There’s fear unit testing will take time your team doesn’t have</li>
<li>Your team can’t agree on an acceptable level of test coverage or get stuck bike-shedding*</li>
<li>People are frustrated by breaking tests when changing code</li>
</ul>
<p>First let’s invest a bit of time understanding what I mean by unit testing. A unit can be any block of code that can be isolated and executed on its own. This can be a function or even a group of functions, although the latter makes it more difficult to test due to many moving parts.</p>
<p>A function is easily testable if it always produces the same output, that is, returns the same thing, when given the same inputs (parameters).</p>
<p>It’s great for testing because we can make assumptions and set expectations based on those return values. The idea being, that when the test passes, the function still satisfies the requirements in the assertions, regardless of how it gets to that result.</p>
<p><strong>An example of simple testing:</strong></p>
<pre><code class="language-js">import { it } from 'mocha';
import { expect } from 'chai';

/**
 * Add numbers together
 *
 * @param {...number} numbers One or many numbers to add
 */
const add = (...numbers) =&gt; {
  return numbers.reduce((acc, val) =&gt; {
    return acc + val;
  }, 0);
};

it('should add numbers', () =&gt; {
  const expected = 15;
  const actual = add(1, 2, 3, 4, 5);

  expect(actual).to.equal(expected); // true
});

/**
 * Subtract numbers from an initial number
 *
 * @param {number}    initialNumber The number we start from when subtracting
 * @param {...number} numbers       One or many numbers to subtract
 */
const minus = (initialNumber, ...numbers) =&gt; {
  return numbers.reduce((acc, val) =&gt; {
    return acc - val;
  }, initialNumber);
};

it('should minus numbers', () =&gt; {
  const expected = 5;
  const actual = minus(15, 5, 3, 2);

  expect(actual).to.equal(expected); // true
});</code></pre>
<p>You can go as far with these tests as you like. If we wanted to, we could add tests for what happens when the add and minus functions are passed values that are not numbers, or for whether they need to deal with negative numbers.</p>
<p>Adding tests for even the simplest functions can provide you with more information about:</p>
<ul>
<li>How hard the function is to use (number of parameters, understanding of output by function name)</li>
<li>Potential risks for the function living in the wild and being used by other developers</li>
<li>Whether the function is doing too much, either because you have to mock the world for it to even run, or if you are asserting too many things per function</li>
</ul>
<p>There’s so much to gain from writing tests, and so much to lose if you don’t.</p>
<h2>You’ve got time for unit testing</h2>
<p>Unit testing your code takes some extra time upfront, because of course, you need to write extra code – the tests.
Then, weeks or months go by after you’ve written those tests and you make a change to a function that was tested, and the test breaks. Bugger. Now you’ve got to go in and fix the test.</p>
<p>I’ve heard people complain that fixing broken tests is hard, time consuming and/or a waste of time. My response is where would you rather be fixing that bug? Would you rather it be in production while people are angry features are broken, or in a unit test, prolonging the time it takes to complete a task?</p>
<p>If you change an API, things should break. If tests did not break and that code went out to production, then everywhere that code was used is now broken; you’ve got 99 problems but lucky you, testing ain’t one.
I’ll tell you what: most teams don’t have time to fix bugs in production, yet time is always made for it. Everyone knows fixing bugs that occur in production is important, from managers to developers, but we’re always waiting until they’re in production to fix them.</p>
<p>To me, it seems as if we could move bug fixing earlier in the process and spend more time focusing on code clarity, which promotes understanding of the code. Half of the time spent fixing a bug is figuring out how the hell it happened. If you had a unit test, it would tell you as soon as you changed something and ran the test.</p>
<p>Writing tests gets easier the more you do it. You will find that after a while you start writing code that’s easily testable because you were thinking about how you would test that code, while you were writing it! Imagine that!</p>
<h2>Just write the damn test</h2>
<p>Engineers are renowned for over-engineering. We think in abstractions and it’s normal to think too much about what should be a simple solution. The hardest part of that is just realising you might be going too far.</p>
<p>Often, when new things pop up, we think of the best solution without addressing the core problems in an efficient manner.</p>
<p>Deciding on team coding best practices is great. Including test coverage, what and how to test among other things is good. Preventing your team from trying things and learning from mistakes is bad.</p>
<p>Don’t let it stand in your way of writing the damn test. A good rule of thumb for any new software is:</p>
<blockquote>
<p>First, make it work. Then make it right.</p>
</blockquote>
<p>This rule can be applied to unit testing in a number of ways, but the most useful I’ve found is to first write the code to make the thing work, preferably small functions and then write a test for it.</p>
<p>Now that you’ve got a tested function, change the internal code of that function and see if the test still passes.
<strong>Simply – Write. Test. Refactor.</strong></p>
<h2>Dealing with broken tests</h2>
<p>After you’ve been writing tests for a while, you should start to notice that more of the things you change will break existing tests. This is a good thing. Don’t underestimate the power of a broken test.</p>
<p>Firstly, it forces the developer who broke the test to understand a bit more about how that piece of code runs: what inputs and outputs are expected, depending on how well it was tested.</p>
<p>Secondly, it forces any API changes to be well thought-out and potentially discussed as a team depending on the size of the change.</p>
<p>Third, <strong>and most importantly</strong>, is that you found out in your terminal, as opposed to when a customer tried to do something.</p>
<p>Just like anything, you can go too far with testing. And it depends on the application as to how deep you go into unit testing.</p>
<p>In my experience, I see no reason good enough not to at least have some unit testing. Run the code in some expected scenarios and see what happens.</p>
<p>It’s just like when you deploy your application and start clicking around on buttons, interacting with the app.</p>
<p>You’re not going to just deploy your application and forget it even exists!</p>
<p>Or would you? Start unit testing today. Start small and work your way up.</p>
<p>*Bike-shedding refers to the time spent solving relatively unimportant issues when the larger problem should be solved before addressing minor details.</p>]]></content:encoded>
</item>
</channel>
</rss>
