Model Alignment and Training

Large Language Models, while impressive in their ability to conversate and recall training information, sometimes they aren’t aware of specific details due to their large training set of data.

Let’s learn how to contribute the correct information to this model using InstructLab!

Adding knowledge and skills to an LLM

In a new terminal window, navigate to the taxonomy directory (<home>/.local/share/instructlab/taxonomy) that was cloned during the initialization step. Here, you’ll find two main subdirectories: knowledge and skills. As the names suggest, knowledge refers to factual information you want to add to the model, while skills involve teaching the model to perform specific tasks or follow certain formats.

Let’s say we want to add knowledge about the Back to the Future movie, which would be considered an addition of knowledge to the model.

Create a new subfolder in taxonomy directory and a qna.yaml (questions & answers) file to hold example question-answer pairs related to DeLorean car.

cd knowledge

mkdir -p trivia/delorean
cd trivia/delorean

And create the qna.yaml file inside this new directory. The file structure is not complicated, first it contains some metadata, then you can add some Q&A pairs that can be included in the fine-tuning process. There is also a link to a public repository for additional data points from which InstructLab will generate additional question-and-answer pairs. This additional data will be used to generate synthetic question-answer pairs during the next step.

qna.yaml

version: 3
domain: time_travel
created_by: RH Developers
seed_examples:
  - context: |
      The DeLorean DMC-12 is a sports car manufactured by John DeLorean's DeLorean Motor Company
      for the American market from 1981 to 1983. The car features gull-wing doors and a stainless-steel body.
      It gained fame for its appearance as the time machine in the "Back to the Future" film trilogy.
    questions_and_answers:
      - question: |
          When was the DeLorean manufactured?
        answer: |
          The DeLorean was manufactured from 1981 to 1983.
      - question: |
          Who manufactured the DeLorean DMC-12?
        answer: |
          The DeLorean Motor Company manufactured the DeLorean DMC-12.
      - question: |
          What type of doors does the DeLorean DMC-12 have?
        answer: |
          Gull-wing doors.
  - context: |
      An engine rebuild costs between $5,000 to $7,000. A transmission rebuild costs between $2,500 to $4,000.
      A brake system overhaul costs between $1,000 to $1,500.
      Suspension work costs between $800 to $1,200.
      Electrical system repairs costs between $600 to $1,000.
      Stainless stell panel work costs between $1,200 to $2,000.
      A gull-wing door mechanism costs between $500 to $800.
      Repairing an air conditioner costs between $300 and $600.
      General maintenance is between $200 and $500 per service.
      A Flux capacitor costs $10,000,000 to repair.
    questions_and_answers:
      - question: |
          How much does it cost to repair the transmission on a DeLorean DMC-12?
        answer: |
          Transmission Repair costs between $2,500 and $4,000 for the Delorean DMC-12.
      - question: How much does it cost to repair the supension on a DeLorean DMC-12?
        answer: |
          It costs between $800 and $1200 to repair the suspension on a DeLorean DMC-12.
      - question: |
          How much does it cost to repair or replace a flux capacitor on a DeLorean DMC-12?
        answer: |
          It costs $10,000,000 to repair a flux capacitor.
  - context: |
      Production Years: 1981–1983
      Body Style: 2-door coupe
      Engine: 2.85 L V6 PRV engine
      Transmission 5-speed manual or 3-speed automatic
      Horsepower 130 hp
      153 lb-ft
      Approximately 8.8 seconds
      110 mph
      2,712 lb (1,230 kg)
      Flux capacitor fitted for time travel and costs $10,000,000
    questions_and_answers:
      - question: What was the production years of the DeLorean DMC-12?
        answer: |
          The car was in production from 1981 to 1983.
      - question: |
          How much horsepower does the DeLorean have?
        answer: |
          It has 130 horsepower.
      - question: |
          What is the flux capacitor used for in a Delorean DMC-12?
        answer: |
          The flux capacitor is used for time travel capabilities and costs $10,000,000.
  - context: |
      Here is a maintenance schedule.
      Regular Oil Changes should be done every 3,000 miles or 3 months.
      Brake Fluid should be changed every 2 years.
      Transmission Fluid should be changed every 30,000 miles.
      Coolant should be changed every 2 years
      You should regularly check the battery for corrosion and proper connection.
      You should add new fluid to the flux capacitor on a regular basis.
    questions_and_answers:
      - question: How often should you change the oil on a DeLorean DMC-12?
        answer: |
          You should change the oil ever 3,000 miles or every 3 months. Whichever comes first.
      - question: |
          How often should changed the blake fluid on a DeLorean DMC-12?
        answer: |
          You should change the blake fluid every two years.
      - question: |
          What should you check on a Delorean DMC-12 as part of a regular maintenance schedule?
        answer: |
          You should consider regular oil changes, blake fluid replacement, transmission fluid, battery,
          flux capacitor and collant.
  - context: |
      Here are some fun facts about the DeLorean DMC-12:
      The DeLorean DMC-12 was originally intended to have a mid-engine layout, but this was later changed
      to a rear-engine layout due to design and cost constraints.
      The DeLorean time machine special edition can time travel by accelerating to 88 miles per hour.
      The flux capacitor, which enables time travel, costs $10,000,000 dollars.
    questions_and_answers:
      - question: |
          How much does it cost to repair a flux capacitor on a Delorean DMC-12?
        answer: |
          It costs $10,000,000 to repair or replace a flux capacitor.
      - question: |
          How fast do you need to be going to enable time travel on a DeLorean DMC-12?
        answer: |
          88 miles per hour.
      - question: |
          What type of layout does a Delorean DMC-12 have?
        answer: |
          It has a rear-engine layout.
document_outline: |
  Details and repair costs on a DeLorean DMC-12 car.
document:
  repo: https://github.com/gshipley/backToTheFuture.git
  commit: 8bd9220c616afe24b9673d94ec1adce85320809c
  patterns:
    - data.md

Then run the following command to verify the new data is valid in the terminal window where you’ve been running ilab in the previous steps. This is important to run it in the same virtual environment created at the beginning of the deep dive.

ilab taxonomy diff

knowledge/trivia/delorean/qna.yaml

Taxonomy in ... is valid :)

Generating synthetic data for model training

So we’ve added some task-specific knowledge—now what? T

he next step is to use InstructLab’s synthetic data generation pipeline to create a large training set from the examples.

The key insight behind InstructLab’s LAB method is that we can use the base model itself to massively expand a small set of human-provided examples. By prompting the model to generate completions conditioned on your examples, we can produce a synthetic dataset that’s much larger and more diverse than what you could feasibly write by hand.

We can run the ilab data generate command to begin generating synthetic data (by default, 100 data points). Remember, we still need to be serving the model with ilab model serve in another terminal instance.

ilab data generate --pipeline simple

During this process, we can visualize the example queries and answers, light preprocessing, and prompts being fed to the base model to generate a large number of candidate completions. The generated completions are filtered and post-processed to remove low-quality or irrelevant outputs. This is a critical step, as the model can sometimes generate nonsensical or factually incorrect responses.

On a typical laptop CPU, synthetic data generation can take anywhere from a few minutes to a few hours, depending on the number of examples and generation parameters. Using a GPU will significantly speed things up. The end result is a set of JSONL files in the specified output directory, with a train/validation/test split. Each line contains an example with input and target completion fields, and feel free to vim to understand how things are working under the hood.

Stop the ilab serve command in the first terminal to save resources for the following section.

Model training with InstructLab

Once the synthetic data generation is complete, it’s time to actually tune the model on the synthetic data with ilab model train.

ilab model train --pipeline=simple

This command will download the necessary model files (if not already available) and begin the alignment phase. On an M1 Mac, it should take 5-15 minutes.

Converting and serving the aligned model

the tuned model weights will be saved in the <home>/.local/share/instructlab/checkpoints/instructlab-granite-7b-lab-mlx-q directory.

The command ilab model convert will convert the model to the GGUF format, creating a quantized version of the model to share on HuggingFace, use locally, etc. Be sure to you first stop the terminal instance that is serving the model (with ilab serve).

ilab model convert --model-dir <home>/.local/share/instructlab/checkpoints/instructlab-granite-7b-lab-mlx-q

Replace <home> with your home directory.

Once finished, we’ll have a new directory in instructlab with our aligned and 4-bit quantized model, for example, instructlab-merlinite-7b-lab-trained.

ilab model serve --model-path instructlab-granite-7b-lab-trained/instructlab-granite-7b-lab-Q4_K_M.gguf

Testing the new model

With the model served, we can switch over to our other terminal instance and use ilab model chat to converse with the model and verify the new knowledge, also pointing to the new quantized model.

ilab model chat

And then let’s repeat the question:

what is the cost of repairing flux capacitor?

──────────────────────────────────────────────────────────────────────────╮
│ A Flux Capacitor in a DeLorean DMC-12 from the Back to the Future movies has an estimated budget of around 10,000 USD in 1985, which corresponds to approximately 33,746 USD in 2021.   │
│ This cost includes various components such as a super capacitor array with 1000 farads each, resistor arra
....

It recalls the exact information we provided in the taxonomy repository. We’ve successfully performed model alignment on consumer-grade hardware and tailored this LLM for our specific use case.

exit from the ilab chat model In the same way, you can use curl to access this service.

curl -X 'POST' \
  'http://localhost:8000/v1/chat/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "what is the cost of repairing flux capacitor"
    }
  ]
}'

And the response:

{"id":"chatcmpl-ea9a8028-78d9-4a39-ab82-03a88c4f88da","object":"chat.completion","created":1728913151,"model":"gpt-3.5-turbo","choices":[{"index":0,"message":{"content":"The Flux Capacitor, a crucial component of the time-traveling DeLorean in the Back to the Future series, is not a real machine and does not have an established cost for repairs. However, if we were to consider the hypothetical costs of creating such a device, several factors would come into play:. **Quantum Bands:** The Flux Capacitor requires superconducting quantum loops to function, which would need to be cooled to near abso.....","role":"assistant"},"logprobs":null,"finish_reason":"stop"}],"usage":{"prompt_tokens":27,"completion_tokens":464,"total_tokens":491}}%

You can access an inferenced model doing a REST call, using the OpenAI format.