Diving into Ecto: Part 1

First, a disclaimer. I have not attempted to blog in quite some time. I'm a bit rusty and I thank you in advance for your patience. Also, I'm an Elixir wannabe and by no means an expert. If I'm doing things wrong or showing bad examples please let me know.

I want to take a deeper look at Elixir's Ecto library. Going beyond the surface requires us to setup a data model with at least a little bit of complexity and generate an interesting dataset. That much should make for a decent sized post. I will defer querying until the next post. I'm attempting to give enough detail to explain what needs to be done without dumping in the entire source. You can find the code in all its hacky glory here:

https://github.com/parkerl/fishing_spot

I have chosen fly fishing as the muse for this series of posts. I know the domain well and enjoy day dreaming about big fish. I'm hoping I can convince my wife that I need to do some field research. Fly fishermen who have more time on their hands than I do often go on multi-day trips. On the trip they fish at multiple locations which might include rivers, streams or lakes. When a fish is landed the fisherman wants to record the location of the catch, the weight, length and species of the fish, and the type of lure or fly used. (The accuracy of this information can be somewhat suspect but that is a different topic altogether).

The schema drawn below attempts to capture in a very normalized way the data described above.

Let's get started!

Run mix new fishing_spot --sup && cd fishing_spot.

We are going to use postgres as our backend. I'm going to assume you already have it setup. A quick "google" should help if you don't.

Edit mix.ex and add the postgrex and ecto dependencies to the deps/1 function:

  defp deps do
    [
      {:postgrex, "~> 0.9.1"},
      {:ecto, "~> 1.0.0"}
    ]
  end

Add both to the applications list:

  def application do
    [applications: [:logger, :postgrex, :ecto],
     mod: {FishingSpot, []}]
  end

Run mix deps.get && mix deps.compile.

Next we need to create a Repo for the project. The Repo is basically the project's connection to the database.

Run mix ecto.gen.repo. This should create a file lib/fishing_spot/repo.ex with the following content:

defmodule FishingSpot.Repo do  
  use Ecto.Repo, otp_app: :fishing_spot
end  

When mix ecto.gen.repo is finished, it helpfully prints this message:

Don't forget to add your new repo to your supervision tree  
(typically in lib/fishing_spot.ex):

    worker(FishingSpot.Repo, [])

Open up lib/fishing_spot.ex and you should see these lines:

    children = [
      # Define workers and child supervisors to be supervised
      # worker(FishingSpot.Worker, [arg1, arg2, arg3])
    ]

We need to add our repo into that list:

children = [  
  worker(FishingSpot.Repo, [])
]

Edit the config/config.exs and we can see the command also added the following to our config file:

config :fishing_spot, FishingSpot.Repo,  
  adapter: Ecto.Adapters.Postgres,
  database: "fishing_spot_repo",
  username: "user",
  password: "pass",
  hostname: "localhost"

Let's change both username and password to fishing_spot in config/config.exs. We need to create that user in postgres. On my machine I do this by running psql -d template1. At the psql prompt our user can be created with:

CREATE USER fishing_spot WITH CREATEDB PASSWORD 'fishing_spot';  

The Schema

Migrations

Let's create migrations to setup our schema in the database. We could use a single migration file but I'm going to create a migration for each table just for fun. Create a file called migrations with the following content:

add_fly_types_table  
add_fish_species_table  
add_trips_table  
add_location_types_table  
add_locations_table  
add_fishermen_table  
add_fish_landed_table  
add_locations_trips_table  
add_fishermen_trips_table  

When you want to execute the same command many times with different inputs xargs is your best bet. By executing xargs -p -L 1 mix ecto.gen.migration < migrations we tell xargs to use each line of the file as the arguments for the mix command. Using the -p flag will tell xargs to ask before executing each command. This is useful here because the migrations are timestamped. Executing without pausing for confirmation can cause mix to create migrations with the same timestamp.

Migration files will be created in the priv/repo/migrations directory. Here are the files I have:

20150520032648_add_fly_types_table.exs  
20150520032650_add_fish_species_table.exs  
20150520032652_add_trips_table.exs  
20150520032656_add_location_types_table.exs  
20150520032657_add_locations_table.exs  
20150520032658_add_fishermen_table.exs  
20150520032700_add_fish_landed_table.exs  
20150520032703_add_locations_trips_table.exs  
20150520032705_add_fishermen_trips_table.exs  

Next I need to enter all the migration code into those files. Here is the change/0 function for the fish_landed migration which is probably the most complicated. Relationships to other tables are setup with references.

  def change do
    create table(:fish_landed) do
      add :date_and_time,    :datetime
      add :weight,           :decimal
      add :length,           :decimal

      add :fisherman_id, references(:fishermen)
      add :location_id, references(:locations)
      add :fly_type_id, references(:fly_types)
      add :fish_species_id, references(:fish_species)

      timestamps
    end
  end

I'm not going to list the code for the rest of the migrations. You can either copy them from the repo or take a stab at building them out yourself.

Run mix do ecto.create, ecto.migrate. This will create the database and run all the migrations. Let's take a look at the table that is created. Fire up psql with psql -d fishing_spot_repo and then enter \d fish_landed.

         Column      |            Type             |   
-----------------+-----------------------------+---------
 id              | integer                     | not null
 date_and_time   | timestamp without time zone |
 weight          | numeric                     |
 length          | numeric                     |
 fisherman_id    | integer                     |
 location_id     | integer                     |
 fly_type_id     | integer                     |
 fish_species_id | integer                     |
 inserted_at     | timestamp without time zone | not null
 updated_at      | timestamp without time zone | not null
Indexes:  
    "fish_landed_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:  
    "fish_landed_fish_species_id_fkey" FOREIGN KEY (fish_species_id) REFERENCES fish_species(id)
    "fish_landed_fisherman_id_fkey" FOREIGN KEY (fisherman_id) REFERENCES fishermen(id)
    "fish_landed_fly_type_id_fkey" FOREIGN KEY (fly_type_id) REFERENCES fly_types(id)
    "fish_landed_location_id_fkey" FOREIGN KEY (location_id) REFERENCES locations(id)

The table is created with all the correct field types for postgres and the foreign key constraints are added. However, indexes are not created for foreign keys by default. Let's add one to the fisherman_id column. We can add the following to the last line of the change/0 function above.

create index(:fish_landed, :fisherman_id)  

The easiest way to rerun the migrations is to completely drop and recreate the database. Run mix do ecto.drop, ecto.create, ecto.migrate.

Oh no! That didn't work. We get a common error no function clause matching. While this error message is not very helpful, it usually indicates that the pattern match of a function failed.

** (FunctionClauseError) no function clause matching in Ecto.Migration.index/3
    (ecto) lib/ecto/migration.ex:391: Ecto.Migration.index(:fish_landed, :fisherman_id, [])
    _build/dev/lib/fishing_spot/priv/repo/migrations/20150521042113_add_fish_landed_table.exs:18: FishingSpot.Repo.Migrations.AddFishLandedTable.change/0
    (stdlib) timer.erl:197: :timer.tc/3
    (ecto) lib/ecto/migration/runner.ex:22: Ecto.Migration.Runner.run/6
    (ecto) lib/ecto/migrator.ex:113: Ecto.Migrator.attempt/6
    (ecto) lib/ecto/migrator.ex:63: anonymous fn/4 in Ecto.Migrator.do_up/4
    (ecto) lib/ecto/pool.ex:292: Ecto.Pool.with_rollback/3
    (ecto) lib/ecto/adapters/sql.ex:566: Ecto.Adapters.SQL.transaction/8

Sure enough, the index/3 function takes a list as the second argument. Let's try again with

create index(:fish_landed, [:fisherman_id])  

Much better. Now we can see in psql that the index was created.

"fish_landed_fisherman_id_index" btree (fisherman_id)

Hmm...I wonder how far we can push Ecto's migrations? Does Ecto support composite primary keys? Let's see if we can find out. I see from the docs here that we can indicate which column is the primary key by passing primary_key: true. Maybe we should try:

    create table(:all_the_things) do
      add :some_string, :string, primary_key: true
      add :someother_string, :string, primary_key: true

      timestamps
    end

Uh oh! Things are not looking good. I get this error:

06:17:56.322 [info]  create table all_the_things  
** (Postgrex.Error) ERROR (invalid_table_definition): multiple primary keys for table "all_the_things" are not allowed
    (ecto) lib/ecto/adapters/sql/worker.ex:26: Ecto.Adapters.SQL.Worker.query!/4
...    

Apparently support has not been added for composite keys. It looks like it was discussed here but did not make it into ecto 1.0.

Let's see what else we can do. Can we use a string as a primary key?

create table(:all_the_things) do  
  add :some_string, :string, primary_key: true

  timestamps
end  

WTF?!@#$ When we migrate we get the same error.

** (Postgrex.Error) ERROR (invalid_table_definition): multiple primary keys for table "all_the_things" are not allowed

It turns out we have to turn off the creation of the default id column. When can pass primary_key: false to table/2 like so:

create table(:all_the_things, primary_key: false) do  
  add :some_string, :string, primary_key: true

  timestamps
end  

Creating Models

Next we need to flesh out the models.

Run:

mkdir lib/fishing_spot/models  
touch lib/fishing_spot/models/{fish_landed,fish_species,fisherman,fisherman_trip,fly_type,location,location_trip,location_type,trip}.ex  

Again I'm not going to show the code for all the models. Here is the FishLanded model which has several relationships to other models:

defmodule FishingSpot.FishLanded do  
  alias FishingSpot.Fisherman
  alias FishingSpot.FishLanded
  alias FishingSpot.FishSpecies
  alias FishingSpot.FlyType
  use Ecto.Model

  schema "fish_landed" do
    timestamps
    field :date_and_time,    Ecto.DateTime
    field :weight,           :decimal
    field :length,           :decimal

    belongs_to :fisherman,     Fisherman
    belongs_to :location,      Location
    belongs_to :fly_type,      FlyType
    belongs_to :fish_species,  FishSpecies
  end
end  

As you can see, we have to specify all the fields and relationships within the model.

Let's take a look at the fisherman.ex file as well.

defmodule FishingSpot.Fisherman do  
  alias FishingSpot.FishermanTrip

  use Ecto.Model

  schema "fishermen" do
    timestamps
    field :name
    field :date_of_birth, Ecto.Date

    has_many :fishermen_trips, FishermanTrip
    has_many :trips, through: [:fishermen_trips, :trip]
    has_many :fish_landed, FishLanded
  end
end  

In this model you can see a "has many through" relationship from fishermen to trips.

Finally we are ready to create records. Fire up iex -S mix and enter the following:

FishingSpot.Repo.insert %FishingSpot.Fisherman{name: 'Ahab'}  

Damn!

** (Ecto.ChangeError) value `'Ahab'` for `FishingSpot.Fisherman.name` in `insert` does not match type :string
      (ecto) lib/ecto/query/planner.ex:33: anonymous fn/6 in Ecto.Query.Planner.fields/4
    (stdlib) lists.erl:1262: :lists.foldl/3
      (ecto) lib/ecto/query/planner.ex:21: Ecto.Query.Planner.fields/4
      (ecto) lib/ecto/repo/model.ex:67: anonymous fn/10 in Ecto.Repo.Model.insert/4

Pattern matching strikes again. This is yet another common gotcha in Elixir. In languages like Ruby and Javascript single and double quotes can be used mostly interchangeably. In Elixir single quotes denote a Char list. We have to use double quotes for strings.

FishingSpot.Repo.insert %FishingSpot.Fisherman{name: "Ahab"}  

BOOM!

** (ArgumentError) repo FishingSpot.Repo is not started, please ensure it is part of your supervision tree
    (ecto) lib/ecto/adapters/sql.ex:548: Ecto.Adapters.SQL.transaction/3

If you were paying attention above you will not get this error. However, I did not heed my own advice and forgot to add my Repo to the supervision tree in lib/fishing_spot.ex.

Once we fix that up we can finally see the record inserted. Ecto helpfully logs the exact SQL that is sent to postgres:

16:12:23.749 [debug] BEGIN [] OK query=297.5ms queue=15.0ms

16:12:23.845 [debug] INSERT INTO "fishermen" ("date_of_birth", "inserted_at", "name", "updated_at") VALUES ($1, $2, $3, $4) RETURNING "id" [nil, {{2015, 9, 6}, {22, 12, 23, 0}}, "Ahab", {{2015, 9, 6}, {22, 12, 23, 0}}] OK query=3.8ms

16:12:23.855 [debug] COMMIT [] OK query=3.0ms  

Sample Data

Now that we have the Ecto models fleshed out and everything working, it's time to generate some data to play with. Let's create a file priv/repo/seeds.exs with a module called FishingSpot.Data to do the work via the generate/0 function.

We will want to run this function over and over as we build up our implementation. Let's add this line outside the FishingSpots.Data module at the very end of the file:

FishingSpot.Data.generate  

This will cause the data generation function to be run every time we load this file.

We can run it with either mix run priv/repo/seeds.exs or by firing up IEx with iex -S mix. Within the IEx session run c("priv/repo/seeds.exs") to compile and run the file.

The first thing we will want is to ensure the database is in a pristine state before generating data. The easiest way to do that is to drop and recreate the database. We can simply lean on Ecto's built in mix tasks.

The generate/0 function should look like:

  def generate do
    Mix.Task.run "ecto.drop",     ["FishingSpot.Repo"]
    Mix.Task.run "ecto.create",   ["FishingSpot.Repo"]
    Mix.Task.run "ecto.migrate",  ["FishingSpot.Repo"]
    _generate
  end

(Note that I have aliased all the necessary FishingSpot modules so I don't have to prefix them.)

The actual data generation will happen in the private function _generate/0. Let's flesh that out now. It would be nice to start with some fish species. In Ecto, records are created by making a struct that has the same name as the model, and keys with the names of the underlying table columns like %ModelName{attribute: "some value"}. This struct is then passed to the Repo which handles persistence. We can insert a record for Rainbow Trout and see what ID is assigned by the database.

rainbow_trout = Repo.insert %FishSpecies{ name: "Rainbow Trout" }  
IO.puts rainbow_trout.id  

However, when we do this and try to run it we get the following error:

== Compilation error on file priv/repo/seeds.exs ==
** (UndefinedFunctionError) undefined function: :ok.id/1 (module :ok is not available)
    :ok.id({:ok, %FishingSpot.FishSpecies{__meta__: #Ecto.Schema.Metadata<:loaded>, id: 11, inserted_at: #Ecto.DateTime<2015-08-25T11:57:20Z>, name: "Rainbow Trout", updated_at: #Ecto.DateTime<2015-08-25T11:57:20Z>}})
    priv/repo/seeds.exs:24: FishingSpot.Data._generate/0
    (elixir) lib/kernel/parallel_compiler.ex:95: anonymous fn/4 in Kernel.ParallelCompiler.spawn_compilers/8

What in the world is going on here? The problem is that Ecto's insert/2 function follows the Erlang idiom of returning a tuple where the first element is an atom indicating the success or failure of the operation. The second element will be the newly inserted record. So in the code above the rainbow_trout variable would end up set to the atom :ok which is the idiomatic indicator for a successful operation. Of course the atom :ok doesn't have an id function, which causes the error. Looking more closely at the error message, we can see something interesting. The part (module :ok is not available) shows that Elixir is actually looking for a module called ok. This is because Elixir uses atoms for module names under the hood. This can lead to confusion if you have a bug like I did above where you accidentally call a function on an atom.

We have two options to fix our issue. We can pattern match the tuple to put the new record into the variable like so:

{:ok, rainbow_trout } = Repo.insert %FishSpecies{ name: "Rainbow Trout" }

or we can use Ecto's insert!/2 function.

rainbow_trout = Repo.insert! %FishSpecies{ name: "Rainbow Trout" }  

Let's use the latter for simplicity. We can write similar insert!/2 calls for the models FlyType, and LocationType.

Next we need to insert our fishermen using the Fisherman model. The Fisherman model has a date field called date_of_birth. This is the first time we've encountered one of Ecto's special datatypes, Ecto.Date. In order to create an Ecto.Date we again build a struct with year, month, and day keys. Inserting a fisherman then looks like:

mark = Repo.insert! %Fisherman{  
  name: "Mark",  
  date_of_birth: %Ecto.Date{ 
    year: 1970, 
    month: 1, 
    day: 2 
  }
}

The next piece of the puzzle is to connect records via foreign keys. For instance, Locations need a LocationType. We can simply put the id of the referenced record into the foreign key column of the other.

river = Repo.insert! %LocationType{ name: "River" }

white_river = Repo.insert! %Location{  
  name: "North Fork White River", 
  altitude: 5000, 
  lat: Decimal.new(40.051879), 
  long: Decimal.new(-107.458016), 
  location_type_id: river.id 
}

Now with that out of the way it's time to catch some fish! Or at least pretend like we did in the age old fishing tradition.

We will want to randomly assign all the attributes and associations of a bunch of FishLanded records to make things interesting. Let's see how we can accomplish that.

First we need some randomness and this is how we get it:

:random.seed(:erlang.now)

Next we need a big loop. Enum.each(1..1000, fn(_) -> should do the trick. Inside, we first need to randomly grab the records that should be associated with each FishLanded. I've stored all the records created up to this point in lists for this purpose. To get a random record from each let's try:

fisherman = Enum.shuffle(fishermen) |> List.first  
location = Enum.shuffle(locations) |> List.first  
fly = Enum.shuffle(flies) |> List.first  
trip = Enum.shuffle(trips) |> List.first  
fish = Enum.shuffle(fish_types) |> List.first  

Now we need to come up with a reasonable size for the fish.

length = Decimal.new(:random.uniform(30))  
weight = Decimal.new(:random.uniform(5))  

Finally we need to determine the date and time the fish was caught. We should make sure it happened between the start and end of the trip that was selected above. Conveniently in my made up data no trip spans multiple months. We can simply grab the starting day and ending day and choose a random day in between. Let's also just use the current local time. We can then use Ecto.DateTime.from_date_and_time to build the value we need to insert the records into the database.

%Ecto.Date{year: year, month: month, day: trip_start} = trip.start_date
%Ecto.Date{day: trip_end} = trip.end_date

trip_length = trip_end - trip_start  
day_caught = :random.uniform(trip_length)  
date_caught = %Ecto.Date{year: year, month: month, day: trip_start + day_caught}

date_and_time_caught = Ecto.DateTime.from_date_and_time(  
date_caught,  
Ecto.Time.local())  

Putting it all together we now can insert our randomized records:

Repo.insert %FishLanded{  
  date_and_time: date_and_time_caught, 
  weight: weight, 
  length: length, 
  fisherman_id: fisherman.id, 
  location_id: location.id, 
  fly_type_id: fly.id, 
  fish_species_id: fish.id
}

Here is my final version of seeds.exs.

We have 1000 lies...I mean records...in the database and we are ready to go fishing for some data. Stay tuned for my next post on Ecto's query interface.


Thanks!

I owe a huge thanks to my team at ATHN, Kim Barnes, David Madouros, Andrew Courter and Joe Santini. They reviewed the multiple iterations of this post and without their help it would be even more of a train wreck.

Also props to the entire Elixir community for being so welcoming and fun!