Preprocessing Module
Image preparation and analysis tools
Overview
The preprocessing module provides tools for analyzing and adjusting archaeological drawings before conversion. It measures image statistics, compares them with reference distributions, and applies targeted adjustments so that inputs match the expected quality.
Dataset Analyzer
class DatasetAnalyzer:
    def __init__(self):
        self.metrics = {}
        self.distributions = {}
Analyzes collections of archaeological drawings and establishes statistical baselines for quality control.
Key Methods
analyze_image
def analyze_image(self, image: Union[str, Image.Image]) -> dict
Extracts key metrics from a single drawing.
Returns
- mean: Average brightness
- std: Standard deviation
- contrast_ratio: Dynamic range measure
- median: Middle intensity value
- dynamic_range: Total intensity range
- entropy: Image information content
- iqr: Inter-quartile range
- non_empty_ratio: Drawing density measure
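The exact formulas behind these metrics are not given here, so the following is only a minimal sketch of how such values might be computed with NumPy for an 8-bit grayscale array. The function name `sketch_metrics`, the Michelson-style `contrast_ratio`, and the `ink_threshold` used for `non_empty_ratio` are assumptions made for illustration, not the module's actual implementation.

```python
import numpy as np

def sketch_metrics(gray: np.ndarray, ink_threshold: int = 250) -> dict:
    """Compute illustrative metrics for an 8-bit grayscale image array.

    All formulas are assumptions; the real analyze_image may differ.
    """
    px = gray.astype(np.float64).ravel()
    lo, hi = px.min(), px.max()
    q25, q75 = np.percentile(px, [25, 75])

    # Shannon entropy (in bits) of the 256-bin intensity histogram
    hist = np.bincount(gray.ravel().astype(np.uint8), minlength=256)
    p = hist[hist > 0] / hist.sum()
    entropy = float(-(p * np.log2(p)).sum())

    return {
        "mean": float(px.mean()),
        "std": float(px.std()),
        # Michelson contrast, assumed here as the "contrast ratio"
        "contrast_ratio": float((hi - lo) / (hi + lo)) if hi + lo > 0 else 0.0,
        "median": float(np.median(px)),
        "dynamic_range": float(hi - lo),
        "entropy": entropy,
        "iqr": float(q75 - q25),
        # Share of pixels darker than the threshold (ink on a light background)
        "non_empty_ratio": float((px < ink_threshold).mean()),
    }

# A white 4x4 page whose top row is black
page = np.full((4, 4), 255, dtype=np.uint8)
page[0, :] = 0
metrics = sketch_metrics(page)
```

On this toy page a quarter of the pixels are ink, so `non_empty_ratio` is 0.25 and `dynamic_range` spans the full 0 to 255 scale.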
analyze_dataset
def analyze_dataset(
    self,
    dataset_path: str,
    file_pattern: tuple = ('.png', '.jpg', '.jpeg')
) -> dict
Builds statistical distributions from a collection of drawings.
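As a rough illustration of what "building distributions" could mean, the sketch below collapses a list of per-image metric dictionaries into one summary per metric. The helper name `build_distributions` and the `values`/`mean`/`std` layout are hypothetical; the structure actually returned by `analyze_dataset` is not documented here.

```python
from statistics import fmean, pstdev

def build_distributions(per_image_metrics: list) -> dict:
    """Collapse per-image metric dicts into per-metric summaries (assumed layout)."""
    distributions = {}
    for name in per_image_metrics[0]:
        values = [m[name] for m in per_image_metrics]
        distributions[name] = {
            "values": values,        # raw samples, useful for plotting later
            "mean": fmean(values),   # centre of the reference distribution
            "std": pstdev(values),   # population standard deviation
        }
    return distributions

# Two drawings, each described by two metrics
sample = [{"mean": 200.0, "std": 40.0}, {"mean": 220.0, "std": 50.0}]
dists = build_distributions(sample)
```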
visualize_distributions_kde
def visualize_distributions_kde(
    self,
    metrics_to_plot: Optional[List[str]] = None,
    save: bool = False
)
Creates KDE plots of metric distributions with statistical annotations.
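For readers unfamiliar with kernel density estimation, here is a minimal NumPy sketch of the curve such a plot would draw for one metric. It uses a Gaussian kernel with Silverman's rule-of-thumb bandwidth; the function name `gaussian_kde_curve` is hypothetical, the sample values are made up for the demonstration, and the module itself may rely on a plotting or statistics library instead.

```python
import numpy as np

def gaussian_kde_curve(values, grid_points: int = 200):
    """Evaluate a Gaussian KDE of `values` on an evenly spaced grid."""
    x = np.asarray(values, dtype=np.float64)
    n = x.size
    # Silverman's rule of thumb for the bandwidth (assumes the data are not all equal)
    iqr = np.percentile(x, 75) - np.percentile(x, 25)
    spread = min(x.std(ddof=1), iqr / 1.34)
    h = 0.9 * spread * n ** (-0.2)
    grid = np.linspace(x.min() - 3 * h, x.max() + 3 * h, grid_points)
    # Sum one Gaussian bump per sample, then normalise to unit area
    z = (grid[:, None] - x[None, :]) / h
    density = np.exp(-0.5 * z ** 2).sum(axis=1) / (n * h * np.sqrt(2 * np.pi))
    return grid, density

# Illustrative 'mean' values from eight drawings
means = [198.0, 203.5, 210.2, 207.8, 215.1, 201.3, 209.9, 212.4]
grid, density = gaussian_kde_curve(means)
```

The returned `grid` and `density` arrays can be passed to any line-plotting routine; annotations such as the mean or standard deviation would be drawn on top.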
save_analysis
def save_analysis(self, path: str) -> None
Saves the current analysis results to a file for later use. This is particularly useful when establishing reference metrics for a specific archaeological context or drawing style.
Parameters
- path: File path to save the analysis results
Examples
analyzer = DatasetAnalyzer()
stats = analyzer.analyze_dataset("reference_drawings/")
analyzer.save_analysis("reference_metrics.npy")
load_analysis
@classmethod
def load_analysis(cls, path: str) -> 'DatasetAnalyzer'
Class method that loads previously saved analysis results. This allows reuse of established reference metrics without reanalyzing the dataset.
Parameters
- path: Path to previously saved analysis file
Returns
Returns a new DatasetAnalyzer instance with loaded analysis results
Examples
# Load previously computed statistics
analyzer = DatasetAnalyzer.load_analysis("reference_metrics.npy")

# Use loaded stats for quality checks
check = check_image_quality("new_drawing.jpg", analyzer.distributions)
These methods enable efficient reuse of analysis results across multiple processing sessions, particularly valuable when working with established archaeological documentation standards or specific site collections.
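The `.npy` extension in the examples suggests NumPy serialization. Assuming the results are held in a plain dictionary, a round trip might look like the sketch below. This is an assumption about the storage format, not a description of what `save_analysis` and `load_analysis` actually do, and the dictionary contents are placeholders.

```python
import os
import tempfile
import numpy as np

# Placeholder reference statistics
distributions = {"mean": {"mean": 210.0, "std": 10.0}}

path = os.path.join(tempfile.mkdtemp(), "reference_metrics.npy")

# A dict is stored as a pickled zero-dimensional object array
np.save(path, distributions)

# allow_pickle=True is required to read object arrays; only load files you trust
restored = np.load(path, allow_pickle=True).item()
```

Because pickled files can execute code when loaded, reference files of this kind are best shared only within a trusted project.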
Process Folder Metrics
def process_folder_metrics(
    input_folder: str,
    model_stats: dict,
    file_extensions: tuple = ('.jpg', '.jpeg', '.png')
) -> None
Batch processes a folder of drawings to align their metrics with reference statistics.
Parameters
- input_folder: Directory containing drawings to process
- model_stats: Reference statistics from DatasetAnalyzer
- file_extensions: Supported file types
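The file selection step of such a batch run can be sketched with `pathlib`. The helper `list_drawings` is hypothetical, and whether the real function matches extensions case-insensitively or descends into subfolders is not stated here; this sketch ignores case and does not recurse.

```python
import tempfile
from pathlib import Path

def list_drawings(input_folder: str,
                  file_extensions: tuple = ('.jpg', '.jpeg', '.png')) -> list:
    """Return files in input_folder whose extension matches, ignoring case."""
    folder = Path(input_folder)
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in file_extensions
    )

# Demonstrate on a throwaway folder with mixed content
demo = Path(tempfile.mkdtemp())
for name in ("plate_01.png", "plate_02.JPG", "notes.txt"):
    (demo / name).touch()
found = [p.name for p in list_drawings(str(demo))]
```

Each returned path would then be checked against `model_stats` and adjusted in turn.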
Apply Recommended Adjustments
def apply_recommended_adjustments(
    image: Union[str, Image.Image],
    model_stats: dict,
    verbose: bool = True
) -> Image.Image
Automatically adjusts a drawing based on statistical analysis.
Parameters
- image: Drawing to adjust
- model_stats: Reference statistics
- verbose: Print adjustment details
Adjustments Applied
- Contrast normalization
- Brightness alignment
- Standard deviation correction
- Dynamic range optimization
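Brightness alignment and standard deviation correction can both be expressed as one linear rescaling of pixel intensities. The sketch below shows that idea for an 8-bit grayscale array; `match_mean_std` is a hypothetical helper, and the module's own adjustment logic may be more elaborate.

```python
import numpy as np

def match_mean_std(gray: np.ndarray, target_mean: float, target_std: float) -> np.ndarray:
    """Linearly rescale intensities so mean and std approach the targets."""
    px = gray.astype(np.float64)
    std = px.std()
    if std == 0:
        # A flat image has no contrast to stretch; only shift its brightness
        out = px - px.mean() + target_mean
    else:
        out = (px - px.mean()) / std * target_std + target_mean
    # Rounding and clipping keep values valid but can pull the result off target
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

drawing = np.array([[100, 120], [140, 160]], dtype=np.uint8)
adjusted = match_mean_std(drawing, target_mean=200.0, target_std=10.0)
```

When many pixels hit the 0 or 255 limits, the achieved mean and standard deviation drift from the targets, which is one reason to re-measure the image after adjusting it.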
Examples
adjusted = apply_recommended_adjustments(
    "drawing.jpg",
    reference_stats,
    verbose=True
)
Check Image Quality
def check_image_quality(
    image: Union[str, Image.Image],
    model_stats: dict
) -> dict
Evaluates a drawing against reference metrics to identify needed adjustments.
Returns
Returns a dictionary containing:
- metrics: Current image measurements
- recommendations: List of suggested adjustments
- is_compatible: Boolean indicating whether the image already matches the reference statistics (False when adjustments are needed)
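One plausible way to produce such a result is to flag every metric that lies too far from its reference mean. The sketch below uses a z-score rule with a tolerance of two standard deviations. The function `sketch_quality_check`, the tolerance, and the `mean`/`std` layout of `model_stats` are all assumptions, not the documented behavior of `check_image_quality`.

```python
def sketch_quality_check(metrics: dict, model_stats: dict, tolerance: float = 2.0) -> dict:
    """Flag metrics lying more than `tolerance` reference std devs from the mean."""
    recommendations = []
    for name, ref in model_stats.items():
        if name not in metrics or ref["std"] == 0:
            continue  # nothing to compare against
        z = (metrics[name] - ref["mean"]) / ref["std"]
        if abs(z) > tolerance:
            direction = "decrease" if z > 0 else "increase"
            recommendations.append(f"{direction} {name} (z = {z:+.1f})")
    return {
        "metrics": metrics,
        "recommendations": recommendations,
        # Compatible only when no metric was flagged
        "is_compatible": not recommendations,
    }

# Placeholder reference statistics and a drawing that is too dark
reference = {"mean": {"mean": 210.0, "std": 10.0}, "std": {"mean": 45.0, "std": 5.0}}
check = sketch_quality_check({"mean": 150.0, "std": 47.0}, reference)
```

Here the drawing's brightness sits six standard deviations below the reference, so it is flagged, while its standard deviation is within tolerance.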
Examples
check = check_image_quality("new_drawing.jpg", reference_stats)
if not check['is_compatible']:
    print("Adjustments needed:", check['recommendations'])
Visualize Metrics Change
def visualize_metrics_change(
    original_metrics: dict,
    adjusted_metrics: dict,
    model_stats: dict,
    metrics_to_plot: Optional[List[str]] = None,
    save: bool = False
) -> None
Creates detailed visualizations comparing original and adjusted metrics against reference distributions.
Parameters
- original_metrics: Metrics before adjustment
- adjusted_metrics: Metrics after adjustment
- model_stats: Reference statistics
- metrics_to_plot: Specific metrics to visualize
- save: Save plot to file
Examples
visualize_metrics_change(
    original_metrics,
    adjusted_metrics,
    reference_stats,
    metrics_to_plot=['contrast_ratio', 'mean', 'std']
)