[4/x] OCaml Setup, Hardcaml Basics, and Project Plan

Sasha

Hi, welcome back to our Hardcaml MIPS project! Today, we'll be setting up OCaml, discussing the basics of Hardcaml with a simple circuit, and going over our plans/vision for this project.

This post assumes understanding of basic OCaml syntax. The examples are as simple as we can make them, but if you've never worked with OCaml before, you might be a bit confused. I strongly recommend reading through this guided tour for the essentials: the book it's from (Real World OCaml) is what I've been using to learn the language.

If you're interested in the end result of this post, I've tagged it as v0.1.2 on GitHub.

OCaml Project Setup

Before we can write any Hardcaml, we need to set up some basic infrastructure so that our project will compile and run. The first step is setting up OCaml and its package manager, opam, which is well documented in this guide. Make sure you also set up your editor of choice via the instructions on that site. I use VSCode, where I had trouble with the top-rated OCaml and Reason IDE extension, but OCaml Platform worked fine. Don't forget to install the OCaml language server via opam install ocaml-lsp-server; otherwise the extension can't do much.

For the purposes of this series, we'll be using OCaml 4.13.1:

opam switch create hardcaml 4.13.1

Before we install Hardcaml, we need to configure our opam switch to use the bleeding-edge versions of hardcaml and related Jane Street libraries:

opam repo add janestreet-bleeding https://ocaml.janestreet.com/opam-repository

opam repo add janestreet-bleeding-external https://github.com/janestreet/opam-repository.git#external-packages

Finally, install Hardcaml:

opam install hardcaml hardcaml_waveterm ppx_jane ppx_expect ppx_deriving_hardcaml

Before continuing, I want to give a definition of the tools we'll be using in this project:

opam is the OCaml package manager. That's what you'd use to install OCaml packages and switch between versions of OCaml if you need to.
dune is a build system for OCaml. We'll be using it to manage our codebase, including generating executables and running automated tests.
merlin provides a bunch of useful IDE features for OCaml, like identifying types of values on hover.
ocamlformat is, unsurprisingly, a code formatter for OCaml.

With that done, let's start setting up the project. We'll need 3 components:

First and foremost, we'll need the actual code for our MIPS CPU.
We'll also want a suite of automated tests for the modules in our CPU.
Finally, since we want to generate Verilog from our Hardcaml code, we'll need to create a simple executable that tells Hardcaml to convert our design to Verilog.

We'll also need to include some top-level configuration files.

Here's how our project structure will look:

hardcaml_mips
├── .github
│   └── workflows
│       └── test.yml
├── lib
│   ├── dune
│   └── datapath.ml
├── test
│   ├── dune
│   └── test_datapath.ml
├── .gitignore
├── .ocamlformat
├── CHANGELOG.md
├── dune
├── dune-project
├── LICENSE
├── main.ml
└── README.md

Let's go over these one at a time.

Top-level configuration

These are relatively standard:

.github/workflows/test.yml (link) is a GitHub Actions script, so our automated tests run every time we push to GitHub. It's relatively straightforward, so I won't discuss it in depth here.
.gitignore (link) is a standard gitignore file, which I'm using to ignore some auto-generated files, IDE config, and reference materials.
CHANGELOG.md (link) is a concise list of major changes per version.
LICENSE (link) is just a license file. We're using the MIT License, which is very simple and permissive, for this project.
README.md (link) briefly describes the project and links to this blog.
.ocamlformat (link) is necessary for ocamlformat to work. All it contains is version=0.18.0, which pins the ocamlformat version and otherwise leaves the default configuration in place.
dune-project (link) is a project-level configuration file for dune. We'll also need individual dune files wherever we have source code, but we'll discuss that later. In dune-project, we specify various metadata/config including the version of dune we're targeting and the project name/description/authors/license type. We also set generate_opam_files to true so that running dune build will automatically create .opam files, which make our project's libraries installable and publishable via opam.

MIPS Source Code

The actual source code for our MIPS CPU is located in lib. We currently have 2 files: a dune config file, and a datapath.ml source file. Let's start with dune:

(library
 (name Mips)
 (public_name hardcaml_mips)
 (libraries hardcaml)
 (preprocess
  (pps ppx_deriving_hardcaml)))

(include_subdirs unqualified)

This file declares several things:

The code in this directory is a library. In other words, it's code that might be used by other libraries or executables.
This library is called Mips, so top-level modules from this library can be accessed via Mips.MODULE_NAME.
Other libraries / dune config files can require this library via the public name hardcaml_mips. When we cover the testing and executable structures, you'll see that hardcaml_mips is included in the libraries part of their dune files.
This code will use the hardcaml library.
This code will use the ppx_deriving_hardcaml PPX preprocessor. PPXs essentially modify your code before compilation, and are used for metaprogramming in OCaml.
We want to consider code in subdirectories as part of the same Mips module. This lets us split up our code into subdirectories very easily.

I recommend reading the dune stanza reference to learn more about dune syntax.

Now, let's take a look at the datapath.ml file. This is going to be the top-level module of our MIPS CPU. In time, all our logic will be defined there, either directly or (mostly) as instantiations of other modules. For now, we just want to check that we can test and generate Verilog for any circuit at all, so it'll be extremely basic.

open Hardcaml.Signal

module I = struct
  type 'a t = { clock : 'a; suffix: 'a } [@@deriving sexp_of, hardcaml]
end

module O = struct
  type 'a t = { pc : 'a [@bits 5] } [@@deriving sexp_of, hardcaml]
end

let create (i : _ I.t) = { O.pc = (of_string "1111") @: i.suffix }

As I mentioned in my previous post, we want to think of most of our modules as functions from some input signals to some output signals. That's exactly what we're doing here. We have input (I) and output (O) modules that represent the structure of our inputs and outputs. We also have a create function, which represents the logic of our circuit. Since we want to keep things extremely simple, all we're doing for now is outputting a single 5-bit signal called pc that consists of 1111 concatenated with the single-bit suffix input. This is not realistic or useful (especially since the MIPS program counter is 32 bits), but easy to understand.
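To see what that expression evaluates to without building a circuit, here is a small sketch that runs the same concatenation on concrete values. It assumes Hardcaml's Bits module, which offers the same combinational operators as Signal but on actual bit vectors; the pc_of_suffix name is made up for this example and is not part of the project.

```ocaml
open Hardcaml

(* Same expression as in Datapath.create, but on concrete Bits values
   instead of Signals. pc_of_suffix is a hypothetical helper name. *)
let pc_of_suffix (suffix : Bits.t) : Bits.t = Bits.(of_string "1111" @: suffix)

let () =
  List.iter
    (fun suffix ->
      let pc = pc_of_suffix suffix in
      (* to_bstr renders a bit vector as a binary string, most significant bit first *)
      Printf.printf "suffix=%s -> pc=%s (%d bits)\n"
        (Bits.to_bstr suffix) (Bits.to_bstr pc) (Bits.width pc))
    [ Bits.gnd; Bits.vdd ]
```

The left operand of @: ends up in the most significant bits, so the suffix becomes the lowest bit of the 5-bit result.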

This pattern of declaring the input and output structures as modules is quite common in Hardcaml. It's known as Hardcaml interfaces, and it lets you use a variety of functors to avoid boilerplate when running simulations, generating Verilog, or using a hierarchy of modules (which we'll need to do to keep our code clean).

Note the use of [@bits 5] and [@@deriving sexp_of, hardcaml]. These are invocations of the PPX system I mentioned earlier, and are used to automatically generate a bunch of boilerplate for input/output interfaces.
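To get a feel for what that generated boilerplate includes, here is a minimal sketch that prints the name and width of each output port. It assumes the code lives in a scratch executable (not part of the repository) whose dune file lists hardcaml_mips under libraries; port_names, port_widths, and iter2 are among the values that the hardcaml deriver provides for an interface.

```ocaml
open Mips

let () =
  (* port_names : string O.t and port_widths : int O.t are generated from the
     record definition; iter2 walks both records field by field. *)
  Datapath.O.iter2 Datapath.O.port_names Datapath.O.port_widths
    ~f:(fun name width -> Printf.printf "%s: %d bit(s)\n" name width)
```

For the O module above, this should report a single port named pc that is 5 bits wide.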

Testing Setup

Similarly to our source code, our test directory has a dune file and a test_datapath.ml file, which contains tests for the corresponding datapath.ml.

dune is extremely similar to what we saw before:

(library
 (name test_mips)
 (inline_tests
  (flags (-verbose)))
 (libraries hardcaml hardcaml_waveterm hardcaml_mips)
 (preprocess
  (pps ppx_jane ppx_expect)))

(include_subdirs unqualified)

A few key differences:

We include an inline_tests field, which allows dune to automatically find and run our test code when we run dune test.
Our libraries field also requires hardcaml_waveterm, which allows us to use ASCII to describe the signals in our circuit, and hardcaml_mips, which corresponds to the public_name field back in our source project.

The testcase code itself is fairly standard: we generate a Simulation module by running our input/output structure through a functor, then use that to observe how the output of our datapath circuit changes for various inputs. I strongly recommend this article by the author of Hardcaml to learn about how Hardcaml testing works.
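For reference, a test along those lines might look like the sketch below. It is a minimal illustration of the approach, not necessarily the exact contents of test_datapath.ml in the repository; the Simulator module name and the pc_for_suffix helper are made up for this example.

```ocaml
open Hardcaml
open Mips

(* Build a cycle-accurate simulator from the datapath's input/output interfaces. *)
module Simulator = Cyclesim.With_interface (Datapath.I) (Datapath.O)

(* Hypothetical helper: drive [suffix], step the simulator once, and return
   the resulting [pc] as a binary string. *)
let pc_for_suffix (sim : Simulator.t) (suffix : Bits.t) : string =
  let inputs : _ Datapath.I.t = Cyclesim.inputs sim in
  let outputs : _ Datapath.O.t = Cyclesim.outputs sim in
  inputs.suffix := suffix;
  Cyclesim.cycle sim;
  Bits.to_bstr !(outputs.pc)

let%expect_test "pc is 1111 followed by the suffix bit" =
  let sim = Simulator.create Datapath.create in
  print_endline (pc_for_suffix sim Bits.gnd);
  print_endline (pc_for_suffix sim Bits.vdd);
  [%expect {|
    11110
    11111 |}]
```

Inputs and outputs are records of references to Bits values, so a test sets the inputs it cares about, calls Cyclesim.cycle, and then reads the outputs.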

Verilog Generation Executable

Our final subsystem is used to generate Verilog from our design, and once again has a dune file and a source code file (this time, main.ml). dune is very similar to what we've seen so far, with one key difference:

(executable
 (name main)
 (libraries hardcaml hardcaml_mips))

In the previous files we saw, the top-level stanza was library. Here, it's executable. This means that dune will automatically generate a main.exe file in _build/default/, which will print Verilog source code when run. That executable runs the code in main.ml:

open Hardcaml
open Mips

module MipsCircuit = Circuit.With_interface(Datapath.I)(Datapath.O)

let circuit = MipsCircuit.create_exn Datapath.create ~name:"datapath"

let () = Rtl.print Verilog circuit

The structure is relatively simple:

We use the Circuit.With_interface functor to create a helper module.
We'll use that to package our Datapath.create circuit implementation in a Hardcaml Circuit.
Finally, we run that Hardcaml circuit through the Hardcaml RTL generation module, printing its Verilog equivalent.

Project Plans

Hopefully that gives you a basic understanding of how to structure an OCaml project, and how to use some basic parts of Hardcaml.

Before concluding, I wanted to lay out a general roadmap for this project. We are still learning Hardcaml, so this is subject to change.

(this post) To start, we want to show that we can design, test, and generate Verilog for a very basic circuit.
Because our CPU will be relatively complicated, we'll want to split it into a hierarchy of circuits, representing the 5 pipeline stages. Each of those will in turn be split into simple circuits. In the next step, we'll want to show that we can test and generate Verilog for multi-module systems.
Our design requires 3 memory blocks: instruction memory, register memory, and data memory. The latter two can be implemented via Hardcaml's multiport_memory function, but instruction memory is trickier because it requires an initial value. We'll need to figure out how to support read-only memory.
In the MIPS design, there are stateful registers between each stage. We want to keep our stages as mostly pure functions (with an exception for writing to memory blocks), so we'll need to use the Always DSL to describe all our stateful logic, centralized in the Datapath module.
Now that we've gotten a grasp of Hardcaml, we'll start implementing our CPU's pipeline stages one by one. We might do separate blog posts for each stage or consolidate interesting observations in one; I'm not sure yet.
With the core of our system in place, we'll start implementing support for branch/jump instructions, as well as useful features like forwarding.

To keep things simple, we'll represent data and instruction memory as registers in our system. If time allows, we might try using hardcaml_xilinx to support actual RAM for data memory, although it'll be hard to test as we don't have actual FPGAs.

Hardcaml Observations

Now that we've started working with Hardcaml, each blog post will include some observations about Hardcaml. This will include things we struggled with, features we liked, etc.

One of my complaints about Vivado is that it lets you mismatch wire widths when instantiating modules. In other words, you could feed a 4-bit wire into a module that takes a 5-bit input, and it would be accepted. Unfortunately, OCaml's type system can't catch this either, because the width of a wire is a value, and OCaml types can't really be parametrized by values (apparently there's something called dependent types that might solve this). This means that Hardcaml can't catch wire width mismatches at compile time. That being said, there's a function to assert that wire widths match, but it doesn't seem to be called automatically.
I really like the ability to represent combinational circuits as pure functions. It just feels right.
Because inputs and outputs are passed as records, Hardcaml actually enforces that required inputs are present, and won't let typos slide. It's a small thing, but I'm very happy to see it.
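To make the first and last observations concrete, here is a toy model in plain OCaml with no Hardcaml involved. The wire and inputs types and the assert_width helper are all made up for this sketch; in particular, assert_width is not the Hardcaml function mentioned above. It only illustrates why a width check has to happen at run time while a missing or misspelled record field is rejected by the compiler.

```ocaml
(* Toy model of a wire: the width is an ordinary runtime value, not part of the type. *)
type wire = { width : int; value : int }

(* Hypothetical helper: reject a wire whose width differs from what a port expects. *)
let assert_width ~expected (w : wire) : wire =
  if w.width <> expected then
    invalid_arg (Printf.sprintf "expected %d bits, got %d" expected w.width);
  w

(* Inputs as a record: leaving out a field or misspelling one is a compile-time error. *)
type inputs = { clock : wire; suffix : wire }

(* Same idea as the datapath above: 1111 concatenated with a 1-bit suffix. *)
let create (i : inputs) : wire =
  let suffix = assert_width ~expected:1 i.suffix in
  { width = 4 + suffix.width; value = (0b1111 lsl suffix.width) lor suffix.value }

let () =
  let bit v = { width = 1; value = v } in
  let pc = create { clock = bit 0; suffix = bit 1 } in
  Printf.printf "pc = %d (%d bits)\n" pc.value pc.width;
  (* A 4-bit wire has the same OCaml type as a 1-bit wire, so this compiles
     and the mismatch is only detected when the check runs. *)
  match create { clock = bit 0; suffix = { width = 4; value = 0 } } with
  | _ -> print_endline "mismatch went unnoticed"
  | exception Invalid_argument msg -> print_endline ("caught at runtime: " ^ msg)
```

Writing create { suffix = bit 1 } without the clock field, or with a misspelled field name, would not compile, which is the record behavior described in the last observation.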