Author: Goga Patarkatsishvili

Last updated: August 22, 2024

It is remarkable that the problems we continuously face in software engineering (fault tolerance, scalability, distribution, high availability, and hot code swapping) were already pressing nearly half a century ago, and that brilliant engineers put immense effort into addressing them back then. One of the most elegant solutions to emerge was Erlang.

Erlang was built in the late 1980s to support massively scalable, real-time systems with extreme uptime requirements. It was initially developed by Ericsson to power telecommunication systems, which demanded constant availability, fault tolerance, and the ability to handle millions of concurrent connections. Erlang thrived in this environment, becoming a cornerstone for systems requiring unmatched reliability and scalability.

Then, in 2012, Elixir appeared. Running on the Erlang VM (BEAM), Elixir inherits Erlang's powerful capabilities while introducing a more developer-friendly syntax and modern tooling. Elixir is compatible with the Erlang ecosystem, so developers can call Erlang libraries directly and build scalable, fault-tolerant systems while benefiting from a smoother coding experience. Elixir is designed to make concurrent and distributed programming easier while retaining the robustness and performance of Erlang.

Elixir's concurrency model is built around lightweight processes managed by BEAM, allowing millions of processes to run simultaneously. These processes share no memory and communicate by passing messages built from immutable data, which removes much of the need for locks and other complex synchronization mechanisms and keeps data consistent across concurrent operations. As a functional language, Elixir promotes a clear and maintainable coding style that leads to more reliable applications. This, combined with BEAM’s fault-tolerant design, makes Elixir a strong choice for building systems that need to be scalable, distributed, and highly available.
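To make the process model a little more concrete, here is a minimal sketch of lightweight processes talking through messages. The message shape `{:result, value}` and the count of 10,000 processes are arbitrary choices made for illustration; only `spawn/1`, `send/2`, and `receive` come from the language itself.

```elixir
# The current process will collect the replies.
parent = self()

# Start 10_000 lightweight processes. Each one squares its number
# and sends the result back to the parent as a message.
for n <- 1..10_000 do
  spawn(fn -> send(parent, {:result, n * n}) end)
end

# Receive exactly one reply per process and add them up.
# No locks are needed: the processes share no memory.
total =
  Enum.reduce(1..10_000, 0, fn _, acc ->
    receive do
      {:result, value} -> acc + value
    after
      5_000 -> raise "timed out waiting for a reply"
    end
  end)

IO.puts("Sum of squares computed by 10,000 processes: #{total}")
```

Replies may arrive in any order, which is fine here because addition does not depend on order.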


A glance at Elixir code

Parallel Tasks with Automatic Retries

This example demonstrates how Elixir handles concurrency and fault tolerance in a simple yet powerful way. Using Task.async_stream/3, we run tasks in parallel, while a small hand-written helper retries failed tasks and logs successes, failures, and retries. This approach shows that Elixir can manage parallel processes efficiently without needing more advanced constructs like GenServers or Supervisors.

defmodule Worker do
  # Worker function that performs a task with a chance of failure
  def perform(task_number) do
    if :rand.uniform(4) == 1 do  # 25% chance of failure
      raise "Task #{task_number} crashed!"
    end

    # Simulate a task with a 1-second delay
    :timer.sleep(1000)
    IO.puts("Task #{task_number} completed successfully by process #{inspect(self())}")
    :ok
  end
end

# Helper function to retry a task on failure
defmodule TaskRunner do
  def retry_task(task_number, retries) do
    try do
      Worker.perform(task_number)
    rescue
      e ->
        IO.puts("Task #{task_number} failed: #{Exception.message(e)}")
        if retries > 0 do
          IO.puts("Retrying Task #{task_number}...")
          retry_task(task_number, retries - 1)
        else
          IO.puts("Task #{task_number} failed after multiple retries.")
          :error
        end
    end
  end
end

# Run tasks concurrently with automatic retry
1..100
|> Task.async_stream(fn task_number ->
  TaskRunner.retry_task(task_number, 3)  # Retry each task up to 3 times if it fails
end, max_concurrency: 10)                # Run 10 tasks concurrently
|> Stream.run()

Explanation:

This code runs 100 tasks with at most 10 executing at once. Each attempt has a 25% chance of failure, and a task is retried up to three times (four attempts in total) before being marked as failed, so the chance that a given task fails for good is 0.25^4, or about 0.4%. Each outcome is printed as it happens: successful completions, failures, and retries.
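The script above throws the return values away with Stream.run(). If you want a summary instead, one option is to collect the results. The sketch below assumes a hypothetical, deterministic stand-in called `fake_task` in place of TaskRunner.retry_task/2, so that the outcome is predictable; it returns the same `:ok` / `:error` atoms as the real helper.

```elixir
# Hypothetical stand-in for TaskRunner.retry_task/2 (not part of the
# original example): every tenth task reports :error, the rest :ok.
fake_task = fn task_number ->
  if rem(task_number, 10) == 0, do: :error, else: :ok
end

summary =
  1..100
  |> Task.async_stream(fake_task, max_concurrency: 10)
  # Each element arrives as {:ok, return_value} when the task finishes normally.
  |> Enum.map(fn {:ok, outcome} -> outcome end)
  # Count how many tasks returned each atom.
  |> Enum.frequencies()

IO.inspect(summary)
```

Because retry_task/2 rescues exceptions and returns `:error` rather than crashing, the same pipeline should work with the real helper; only the counts would vary from run to run.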

The example highlights Elixir’s ability to manage parallel processes without getting into more advanced concepts like GenServers or Supervisors (which are essential for larger applications).
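For a small taste of what supervision looks like, here is a rough sketch using Task.Supervisor, which ships with Elixir. It is not how the example above works; it only illustrates the idea that a crashing task can be isolated from the code that started it. The one-second timeout and the `outcome` atoms are illustrative choices.

```elixir
# Start a supervisor that will own the task processes.
{:ok, sup} = Task.Supervisor.start_link()

# async_nolink/2 runs the function in a supervised process that is NOT
# linked to the caller, so a crash in the task does not crash the caller.
task = Task.Supervisor.async_nolink(sup, fn -> raise "boom" end)

# Wait up to one second and translate the result into a simple atom/tuple.
outcome =
  case Task.yield(task, 1_000) do
    {:ok, value} -> {:finished, value}
    {:exit, _reason} -> :crashed
    nil -> :still_running
  end

IO.inspect(outcome)
```

The crash is still logged to the console, but the calling process keeps running and can decide what to do next, for example retrying the work.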