%% Cell type:markdown id:d6264ff5d5024ba1 tags:
# Finetuning experiments
Based on https://github.com/ml-explore/mlx-examples/tree/main/lora.
%% Cell type:markdown id:f3d8cb11f32bf4de tags:
## Create a 4-bit quantized model
%% Cell type:code id:8bb22a5cb2ec1db0 tags:
``` python
!python convert.py --hf-path mistralai/Mistral-7B-v0.1 -q
```
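%% Cell type:markdown tags:
As a quick smoke test before training, the converted weights can be loaded with `mlx`. A minimal sketch, assuming the quantized model was written to `mlx_model/Mistral-7B-v0.1` (the path used for training below) and that its weights live in a `weights.npz` file; both the location and the file name are assumptions:
%% Cell type:code tags:
``` python
import mlx.core as mx

# mx.load returns a dict of arrays for .npz files; the path is an assumption.
weights = mx.load('mlx_model/Mistral-7B-v0.1/weights.npz')
print(f'{len(weights)} weight arrays loaded')
```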
%% Cell type:markdown id:1135fbc8a6ced279 tags:
## Create training data
### Download website data
This downloads new content only if the list of journals has changed or previously downloaded files have been deleted. To overwrite existing files, use `overwrite=True`.
%% Cell type:code id:9eb2effc7bfb22f tags:
``` python
from lib.prepare_training_data import download_input_data
download_input_data(input_file='data/editors.csv',
                    output_dir='data/website-data',
                    overwrite=False)
```
%% Output
Downloaded 0 web pages.
%% Cell type:code id:31a2389404720256 tags:
``` python
from lib.prepare_training_data import create_training_file
instruction = "Below is the content of a website of a German law journal. For each member of the editorial board or the advisory board, extract the following information: lastname, firstname, title, position, affiliation, role. Return as a YAML list of dictionaries. Omit keys that you cannot find information for."
create_training_file(instruction=instruction,
                     input_file='data/editors.csv',
                     output_dir='data',
                     website_dir='data/website-data',
                     cols_to_remove=['journal_abbr', 'website', 'retrieved_on'],
                     column_to_filter_by='lastname',
                     lines_before=2, lines_after=1)
```
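%% Cell type:markdown tags:
To sanity-check the result, the first record of the generated training file can be inspected. A minimal sketch, assuming `create_training_file` writes a `train.jsonl` with one `{"text": ...}` record per line into `output_dir` (file name and schema are assumptions):
%% Cell type:code tags:
``` python
import json

# Peek at the first training example; path and schema are assumptions.
with open('data/train.jsonl') as f:
    example = json.loads(f.readline())
print(example['text'][:500])
```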
%% Cell type:code id:fd1a48e84474aaea tags:
``` python
!python lora.py --model mlx_model/Mistral-7B-v0.1 --train --iters 600 --batch-size 1 --lora-layers 4
```
%% Cell type:code id:51b420d949a23c54 tags:
``` python
```
%% Cell type:markdown id:7e10d007a2d411f0 tags:
```
$ python lora.py --model mlx_model/Mistral-7B-v0.1 --train --iters 600 --batch-size 1 --lora-layers 4
Loading pretrained model
Total parameters 1242.763M
Trainable parameters 0.426M
Loading datasets
Training
Iter 1: Val loss 1.805, Val took 93.856s
Iter 10: Train loss 1.275, It/sec 0.144, Tokens/sec 115.780
[WARNING] Some sequences are longer than 2048 tokens. Consider pre-splitting your data to save memory.
Iter 20: Train loss 1.052, It/sec 0.087, Tokens/sec 92.686
[WARNING] Some sequences are longer than 2048 tokens. Consider pre-splitting your data to save memory.
Iter 30: Train loss 1.230, It/sec 0.110, Tokens/sec 91.892
[WARNING] Some sequences are longer than 2048 tokens. Consider pre-splitting your data to save memory.
Iter 40: Train loss 1.032, It/sec 0.109, Tokens/sec 91.080
Iter 50: Train loss 0.977, It/sec 0.128, Tokens/sec 95.607
Iter 60: Train loss 1.021, It/sec 0.166, Tokens/sec 94.361
[WARNING] Some sequences are longer than 2048 tokens. Consider pre-splitting your data to save memory.
Iter 70: Train loss 1.077, It/sec 0.097, Tokens/sec 87.647
[WARNING] Some sequences are longer than 2048 tokens. Consider pre-splitting your data to save memory.
libc++abi: terminating due to uncaught exception of type std::runtime_error: [METAL] Command buffer execution failed: Insufficient Memory (00000008:kIOGPUCommandBufferCallbackErrorOutOfMemory)
......
```
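The run aborts with a Metal out-of-memory error. The repeated warning points at one mitigation: pre-split training sequences that exceed the 2048-token context. A minimal sketch of such a pre-splitting step, assuming the data lives in `data/train.jsonl` with one `{"text": ...}` record per line; whitespace splitting is a crude stand-in for the Mistral tokenizer, which will produce more tokens than words, so a safety margin may be needed:
``` python
import json

MAX_TOKENS = 2048  # context length reported in the warning above

def split_long_examples(in_path, out_path, max_tokens=MAX_TOKENS):
    """Split any training example longer than max_tokens into smaller chunks."""
    with open(in_path) as fin, open(out_path, 'w') as fout:
        for line in fin:
            text = json.loads(line)['text']
            words = text.split()  # crude proxy for real tokenization
            for i in range(0, len(words), max_tokens):
                chunk = ' '.join(words[i:i + max_tokens])
                fout.write(json.dumps({'text': chunk}) + '\n')

split_long_examples('data/train.jsonl', 'data/train-split.jsonl')
```
Of the other levers, `--batch-size` is already at its minimum of 1, and `--lora-layers` could be reduced below 4, but pre-splitting the long sequences addresses the warning directly.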