Guide to Fine-tuning PyPotteryInk Models

This guide walks you through the process of fine-tuning a PyPotteryInk model for your specific archaeological context.

Prerequisites

Before starting the fine-tuning process, ensure you have:

  1. A GPU with at least 20GB VRAM for training
  2. Python 3.10 or higher
  3. A paired dataset of pencil drawings and their inked versions
  4. Storage space for model checkpoints and training data
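
To verify the first two requirements, you can check your Python version and GPU memory from the command line (this assumes an NVIDIA GPU with the standard drivers installed):

    python --version
    nvidia-smi --query-gpu=name,memory.total --format=csv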

Environment Setup

  1. First, clone the repository:

    git clone https://github.com/GaParmar/img2img-turbo.git

  2. Install the required dependencies:

    pip install -r img2img-turbo/requirements.txt
    pip install git+https://github.com/openai/CLIP.git
    pip install wandb vision_aided_loss huggingface-hub==0.25.0
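
As a quick sanity check, you can confirm that the installed PyTorch build can see your GPU:

    python -c "import torch; print(torch.__version__, torch.cuda.is_available())"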

Dataset Preparation

  1. To create a training dataset, follow the original img2img-turbo documentation (https://github.com/GaParmar/img2img-turbo/blob/main/docs/training_pix2pix_turbo.md)

  2. Important considerations for dataset preparation:

    • Images should be paired (same filename in both folders; a verification sketch follows this list)
    • Standard image formats (jpg, jpeg, png) are supported
    • Both pencil and inked versions should be aligned
    • Recommended resolution: at least 512x512 pixels
  3. Data requirements:

    • Minimum recommended: 10-20 pairs for fine-tuning
    • Each drawing should be clean and well-scanned
    • Include variety in pottery types and decorations
    • Consistent drawing style across the dataset
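
The following is a minimal sketch for verifying that every pencil drawing has a same-named inked partner. It assumes the paired-folder layout from the img2img-turbo docs, written here as train_A (pencil) and train_B (inked); substitute your actual folder names:

    # Report any file in train_A without a matching file in train_B (assumed folder names)
    for f in train_A/*; do
        base=$(basename "$f")
        [ -e "train_B/$base" ] || echo "Missing inked pair for: $base"
    done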

Setting Up Fine-tuning

To fine-tune a pre-trained model (like “6h-MCG”), you’ll need to modify the base img2img-turbo repository. This enables the use of a pre-trained model as a starting point for your specialized training.

  1. Prepare the Repository:
    • Navigate to your cloned img2img-turbo directory:

      cd img2img-turbo
  2. Replace Key Files:
    • Copy these files from the PyPotteryInk repository’s “fine-tuning” folder into the src folder:

      • pix2pix_turbo.py
      • train_pix2pix_turbo.py
    These modified files contain the adaptations needed to train from a pre-trained model; example copy commands are shown below.
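
For example, if your PyPotteryInk clone sits next to img2img-turbo (the ../PyPotteryInk path below is a placeholder; adjust it to wherever your clone lives), the copy step might look like this:

    cp ../PyPotteryInk/fine-tuning/pix2pix_turbo.py src/
    cp ../PyPotteryInk/fine-tuning/train_pix2pix_turbo.py src/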

Running Fine-tuning

  1. Initialize Accelerate Environment:

    accelerate config

    This will guide you through setting up your training environment. Follow the prompts to configure for your GPU setup.
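
    If you are on a single-GPU machine and happy with the defaults, recent versions of Accelerate can also write a default configuration without the interactive prompts:

    accelerate config default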

  2. Start Training:

    accelerate launch src/train_pix2pix_turbo.py \
        --pretrained_model_name_or_path="6h-MCG.pkl" \
        --output_dir="YOUR_OUTPUT_DIR" \
        --dataset_folder="YOUR_INPUT_DATA" \
        --resolution=512 \
        --train_batch_size=2 \
        --enable_xformers_memory_efficient_attention \
        --viz_freq 25 \
        --track_val_fid \
        --report_to "wandb" \
        --tracker_project_name "YOUR_PROJECT_NAME"

    Key Parameters:

    • pretrained_model_name_or_path: Path to your pre-trained model (e.g., “6h-MCG.pkl”)
    • output_dir: Where to save training outputs and checkpoints
    • dataset_folder: Location of your training dataset
    • resolution: Image resolution (512 recommended)
    • train_batch_size: Number of images per training batch
    • viz_freq: How often to generate visualization samples
    • track_val_fid: Enable FID score tracking
    • tracker_project_name: Your Weights & Biases project name

    Note: Adjust the batch size based on your GPU memory. Start with 2 and increase if your GPU can handle it.
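
    To see whether a larger batch fits, you can watch GPU memory from a second terminal while training runs (the -l 5 flag refreshes the readout every five seconds):

    nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 5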

Important Considerations

  • Ensure your pre-trained model file (e.g., “6h-MCG.pkl”) is at the path you pass to --pretrained_model_name_or_path
  • Monitor GPU memory usage during training
  • Use Weights & Biases (wandb) to track training progress
  • Check the output directory periodically for sample outputs
  • Training time will vary based on your GPU and dataset size
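
For example, a quick way to list the most recently written files in your output directory (replace YOUR_OUTPUT_DIR with the value you passed to --output_dir):

    ls -lt YOUR_OUTPUT_DIR | head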