Welcome to Ceramic Hacker

Hi, I'm Alexander (Sasha) Skvortsov. I'm a computer science and math major at Penn State. I'll be using this blog to share some of my projects and thoughts, mainly focused around software engineering and pottery.

[9/x] Always DSL and the Control Unit

Hi, welcome back to our Hardcaml MIPS project! Today, we'll be exploring how we can use the Always DSL to write Verilog-like code while keeping all of Hardcaml's safety checks. We'll use this to implement our CPU's control unit. If you'd like to see the end result of this post, it's tagged as v0.6.1 on GitHub.

What is a Control Unit?

We can think of our CPU's stages as follows:

  1. Fetch
    1. Get the current instruction
  2. Decode
    1. Figure out what the instruction means
    2. Get current values of registers used by the instruction
  3. Execute
    1. Figure out what the ALU inputs are, depending on the instruction
    2. Execute some ALU operation on those inputs, depending on the instruction
  4. Memory
    1. Write or read to/from data memory, depending on the instruction
  5. Writeback
    1. Write to the register file, depending on the instruction

You've probably noticed that everything we do to process an instruction after we've decoded it depends on the instruction. MIPS has a lot of instructions, and each needs to be handled differently.

In a nutshell, the job of a control unit is to take an instruction, figure out what type it is, and output "control signals" for the rest of the CPU based on that type. These control signals include:

  • Whether the instruction writes to memory
  • Whether the instruction writes to the register file
  • Which ALU operation the instruction uses

It does so by checking the opcode and funct parts of the instruction against a hardcoded list. Recall that each instruction is 32 bits: the first 6 (bits 31 to 26) are the opcode, and the last 6 (bits 5 to 0) are the funct, which only R-type instructions use.
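As a quick sanity check, here is a minimal sketch in plain OCaml (ordinary integers, not Hardcaml signals) that pulls the opcode and funct out of two hand-assembled instruction words. It assumes the standard MIPS layout, with the opcode in bits 31 to 26 and the funct in bits 5 to 0. The helper names bits, opcode, and funct are made up for this illustration and are not part of the project.

```ocaml
(* Extract bits [hi..lo] (inclusive) from an instruction word stored in an
   OCaml int. Assumes a 64-bit platform, so a 32-bit word fits in an int. *)
let bits ~hi ~lo word = (word lsr lo) land ((1 lsl (hi - lo + 1)) - 1)

(* The opcode is the top 6 bits; the funct is the bottom 6 bits. *)
let opcode word = bits ~hi:31 ~lo:26 word
let funct word = bits ~hi:5 ~lo:0 word

(* Hand-assembled examples: add $t0, $t1, $t2 and lw $t0, 4($t1). *)
let add_word = 0x012A4020
let lw_word = 0x8D280004

let () =
  Printf.printf "add: opcode=%d funct=0x%02x\n" (opcode add_word) (funct add_word);
  Printf.printf "lw:  opcode=0x%02x\n" (opcode lw_word)
```

The values to look for are the same constants that appear in the classifier later in this post: funct 6'b100000 for add and opcode 6'b100011 for lw.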

When we made this in Verilog, we did exactly that: we checked the opcode and funct against a big list of options in a giant switch statement (source code here). This isn't irredeemable, but there are a few downsides:

  • We label opcodes/functs using comments. It would be preferable to use variables, because changes to variables can be picked up by a compiler or static analysis.
  • We declare a list of signals for each instruction type. A more intuitive formulation would be some form of logic, e.g. "write to memory iff the instruction type is 'store word'".
  • All the logic ends up in one giant block, even though it does multiple things. Breaking it up into separate functions would be more readable/maintainable.

Shortly, we'll see how we can use Hardcaml's Always DSL to avoid some of these issues.

Before I continue, I want to note that there's one more job the instruction decode stage does: parsing out the instruction. There are actually several different instruction formats in MIPS:

And depending on the format/type of the instruction we're processing, we'll want to parse the instruction differently. For example, R-type instructions use rd for the writeback destination address, while I-type instructions use rt. In our Verilog version, we did this parsing separately, but it might be simpler to have it be part of the control unit.
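To make the rd/rt difference concrete, here is a small plain-OCaml sketch (again using ordinary integers rather than Hardcaml signals). It assumes the standard MIPS field positions: rs in bits 25 to 21, rt in bits 20 to 16, and rd in bits 15 to 11. The format type and the write_dest function are hypothetical names for this illustration, not the project's parser.

```ocaml
type format = R_type | I_type

(* Extract bits [hi..lo] (inclusive); assumes a 64-bit OCaml int. *)
let bits ~hi ~lo word = (word lsr lo) land ((1 lsl (hi - lo + 1)) - 1)

(* rs and rt exist in both formats; rd is only meaningful for R-type. *)
let rs word = bits ~hi:25 ~lo:21 word
let rt word = bits ~hi:20 ~lo:16 word
let rd word = bits ~hi:15 ~lo:11 word

(* R-type instructions write back to rd, I-type instructions to rt. *)
let write_dest format word =
  match format with
  | R_type -> rd word
  | I_type -> rt word

let () =
  (* add $t0, $t1, $t2 and lw $t0, 4($t1) both write register 8 ($t0). *)
  Printf.printf "add dest=%d\n" (write_dest R_type 0x012A4020);
  Printf.printf "lw  dest=%d\n" (write_dest I_type 0x8D280004)
```

In hardware this choice becomes a multiplexer selected by the instruction format, which is why the parser needs the classifier's output.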

A Better Design

For this project, we'll use a somewhat different design. Instead of directly listing output signals for every opcode and funct, we'll first map the opcode and funct to intermediate "instruction format" and "instruction type" wires, then use those to figure out signals and parse the instruction.

As shown in the diagram, our control unit will be composed of 3 functions. The classifier will figure out the format/type of an instruction, and the parser and control core will output parsed instruction parts and control signals, respectively.

Hardcaml Implementation

To keep things simple, we'll only implement 4 instruction types for now: add, sub(tract), lw (load word), and sw (store word). add and sub are R-Type, and lw and sw are I-Type.

We'll implement the control unit via Hardcaml's Always DSL. This is a set of operations that allows us to use Verilog-like procedural syntax (in the style of an always block) while keeping many of Hardcaml's benefits. It's useful for describing complicated logic.

Here's the "classifier" portion of our control unit:

let rtype_classifier instr =
  let funct = instr.:[(5, 0)] in
  let instr_type = Always.Variable.wire ~default:Instruction_type.default in
  Always.(
    compile
      [
        switch funct
          [
            (of_string "6'b100000", [ instr_type <-- Instruction_type.add ]);
            (of_string "6'b100010", [ instr_type <-- Instruction_type.sub ]);
          ];
      ]);
  Always.Variable.value instr_type

let classifier instr =
  let opcode = instr.:[(31, 26)] in
  let format = Always.Variable.wire ~default:Instruction_format.default in
  let instr_type = Always.Variable.wire ~default:Instruction_type.default in
  Always.(
    compile
      [
        switch opcode
          [
            ( of_string "6'b000000",
              [
                format <-- Instruction_format.r_type;
                instr_type <-- rtype_classifier instr;
              ] );
            ( of_string "6'b100011",
              [
                format <-- Instruction_format.i_type;
                instr_type <-- Instruction_type.lw;
              ] );
            ( of_string "6'b101011",
              [
                format <-- Instruction_format.i_type;
                instr_type <-- Instruction_type.sw;
              ] );
          ];
      ]);
  (Always.Variable.value format, Always.Variable.value instr_type)

This is a bit similar to the original Verilog version, but there are several advantages:

  • We can split the nested switch out into a separate function.
  • We're classifying the instruction into a type and a format, not directly listing control signals.

In that example, the Instruction_format.x and Instruction_type.x variables are just arbitrary, hardcoded constants. We need to represent the difference between add, sub, lw, and sw in hardware, so we'll just use the values 1, 2, 3, and 4. Think of this as a very crude version of enums. The Instruction_type module looks like this:

module Instruction_type = struct
  let default = of_string "6'h0"
  let add = of_string "6'h1"
  let sub = of_string "6'h2"
  let lw = of_string "6'h3"
  let sw = of_string "6'h4"
end

Instruction_format is similar. The default value doesn't really mean anything; it's only needed to instantiate the Always DSL wires in the classifier example above.

We can then use the instruction format and type to generate control signal outputs:

let type_to_alu_control instr_type =
  let aluc = Always.Variable.wire ~default:Alu_ops.default in
  Always.(
    compile
      [
        switch instr_type
          [
            (Instruction_type.add, [ aluc <-- Alu_ops.add ]);
            (Instruction_type.sub, [ aluc <-- Alu_ops.subtract ]);
            (Instruction_type.lw, [ aluc <-- Alu_ops.add ]);
            (Instruction_type.sw, [ aluc <-- Alu_ops.add ]);
          ];
      ]);
  Always.Variable.value aluc

let control_core format instr_type =
  let reg_write_enable =
    (format ==: Instruction_format.r_type)
    |: (instr_type ==: Instruction_type.lw)
  in
  let sel_mem_for_reg_data = instr_type ==: Instruction_type.lw in
  let mem_write_enable = instr_type ==: Instruction_type.sw in
  let sel_imm_for_alu = format ==: Instruction_format.i_type in
  let alu_control = type_to_alu_control instr_type in
  let module C = Control_signals in
  {
    C.reg_write_enable;
    sel_mem_for_reg_data;
    mem_write_enable;
    sel_imm_for_alu;
    alu_control;
  }

Note that our control signals are now expressed as functions of our type/format (e.g. write to memory iff the instruction type is "store word"). Also, type_to_alu_control follows the same pattern as the classifier example, except that we can also use variables for the switch cases.
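One way to double-check this kind of logic is to write the same rules as a plain-OCaml software model and look at the values it produces for each instruction. The sketch below is only such a model, using booleans and variants instead of Hardcaml signals. The type and constructor names are invented for the illustration; only the rules themselves are taken from control_core and type_to_alu_control above.

```ocaml
type format = R_type | I_type
type instr_type = Add | Sub | Lw | Sw
type alu_op = Alu_add | Alu_subtract

type control_signals = {
  reg_write_enable : bool;
  sel_mem_for_reg_data : bool;
  mem_write_enable : bool;
  sel_imm_for_alu : bool;
  alu_control : alu_op;
}

(* Software model of the control core: each signal is a small function of
   the instruction's format and type. *)
let control_core format instr_type =
  {
    reg_write_enable = (format = R_type) || (instr_type = Lw);
    sel_mem_for_reg_data = (instr_type = Lw);
    mem_write_enable = (instr_type = Sw);
    sel_imm_for_alu = (format = I_type);
    alu_control =
      (match instr_type with
       | Sub -> Alu_subtract
       | Add | Lw | Sw -> Alu_add);
  }

let () =
  let sw = control_core I_type Sw in
  Printf.printf "sw: mem_write=%b reg_write=%b\n"
    sw.mem_write_enable sw.reg_write_enable
```

A model like this can also serve as a reference when writing testbenches for the real circuit, since each expected output is one function call away.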

Finally, we combine the classifier, control core, and parser (source code on GitHub) to get our full control unit circuit implementation:

let circuit_impl (_scope : Scope.t) (input : _ I.t) =
  let instr_format, instr_type = classifier input.instruction in
  let parsed_instruction = parser input.instruction instr_format in
  let control_signals = control_core instr_format instr_type in
  { O.parsed_instruction; control_signals }

Potential Improvements

This is already a lot better than what we had with Verilog, but there's still room for improvement:

  • Having to specify an arbitrary value for each enum option is messy. It would be nice if that could be generated automatically.
  • If we could use OCaml Variants and match, the compiler would perform an exhaustivity check, forcing us to account for all options.
  • The default value we're forced to include in each set of constants shouldn't actually ever be outputted, so it would be preferable not to have it.

Luckily, Hardcaml has an Enum system that solves all of these! We've mostly figured it out with some help from the maintainers, and will rewrite the control unit to use it in a future post.
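In the meantime, here is a plain-OCaml sketch of what the first two points look like with ordinary variants. This is not Hardcaml's Enum API, just an illustration of the idea: the encoding is derived from each constructor's position in a list instead of being chosen by hand, and match makes the compiler check that every option is handled. The names all, to_code, and name are hypothetical.

```ocaml
type instr_type = Add | Sub | Lw | Sw

(* Every constructor, in encoding order. *)
let all = [ Add; Sub; Lw; Sw ]

(* The encoding is the constructor's position in [all], so nobody has to
   pick numbers by hand. *)
let to_code t =
  let rec index i = function
    | [] -> invalid_arg "to_code: constructor missing from [all]"
    | x :: rest -> if x = t then i else index (i + 1) rest
  in
  index 0 all

(* If a case is removed from this match, the compiler reports a
   non-exhaustive pattern match (warning 8). *)
let name = function
  | Add -> "add"
  | Sub -> "sub"
  | Lw -> "lw"
  | Sw -> "sw"

let () =
  List.iter (fun t -> Printf.printf "%s -> %d\n" (name t) (to_code t)) all
```

Note that this sketch has no placeholder default value, which is the third point on the list.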

Other Always DSL Features

Our control unit's needs only scratch the surface of what the Always DSL can do, so I wanted to talk a bit about it. Essentially, there are 3 types of "always directives" (we'll refer to these as Always.t):

  • An assignment, i.e. some_wire <-- some_signal
  • An if, where the condition is a signal, and the "body" is a list of Always.t
  • A switch, where the selector is a signal, each case is a signal, and each case body is a list of Always.t

Note that the if and switch types are defined recursively in terms of Always.t. This means that we could have ifs nested in switches nested in more switches, and so on. In fact, any arbitrary combination of assignments, ifs, and switches is possible. The Always DSL is essentially a mini programming language embedded in Hardcaml, which is itself a mini programming language embedded in OCaml. Cool!
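To show what "defined recursively" means here, the sketch below models the three directives as a toy OCaml variant with a tiny interpreter. It is not Hardcaml's actual definition: the real DSL works on signals and compiles to multiplexers, while this toy uses plain integers and booleans and simply evaluates one set of input values. All names (stmt, run, step, program) are made up for the illustration.

```ocaml
(* A toy statement type shaped like the three directives described above. *)
type stmt =
  | Assign of string * int (* variable name <-- constant value *)
  | If of bool * stmt list * stmt list (* condition, then-body, else-body *)
  | Switch of int * (int * stmt list) list (* selector, (case, body) pairs *)

(* Run statements in order against an association list of variable values;
   a later assignment to a variable replaces the earlier one. *)
let rec run env stmts = List.fold_left step env stmts

and step env = function
  | Assign (name, v) -> (name, v) :: List.remove_assoc name env
  | If (cond, then_body, else_body) ->
      run env (if cond then then_body else else_body)
  | Switch (sel, cases) -> (
      match List.assoc_opt sel cases with
      | Some body -> run env body
      | None -> env (* no matching case: variables keep their defaults *))

(* An if nested inside a switch: opcode 0 is refined further, opcode 35 is lw. *)
let program opcode is_sub =
  [
    Switch
      ( opcode,
        [
          (0, [ If (is_sub, [ Assign ("instr_type", 2) ], [ Assign ("instr_type", 1) ]) ]);
          (35, [ Assign ("instr_type", 3) ]);
        ] );
  ]

let () =
  let defaults = [ ("instr_type", 0) ] in
  let result = List.assoc "instr_type" (run defaults (program 0 true)) in
  Printf.printf "instr_type = %d\n" result
```

Because each body is itself a list of statements, the nesting can go as deep as needed, and unmatched cases fall back to the default, much like the wires in our classifier.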

One of the nicest features of the Always DSL is its built-in support for state machines. I'm not going to provide examples since I haven't worked with it myself, but I highly recommend reading the relevant docs to learn about state machines and other advanced Always DSL features.
