Reporting Module
The reporting module handles visualization and export of results.
Report Generator
Data Aggregator
Data aggregation module for ForzaEmbed reporting.
This module provides the DataAggregator class that handles aggregation and caching of processed data from the database for report generation.
Example
Aggregate data for reporting:
from src.reporting.aggregator import DataAggregator
aggregator = DataAggregator(db, output_dir, "config_name")
data = aggregator.get_aggregated_data()
- class src.reporting.aggregator.DataAggregator(db, output_dir, config_name)
Bases: object
Handle aggregation and caching of processed data from the database.
Aggregates processing results from the database into a format suitable for report generation, with caching to avoid redundant computation.
- db
The embedding database containing results.
- output_dir
Directory path for cache files.
- cache_path
Path to the cache file.
- __init__(db, output_dir, config_name)
Initialize the DataAggregator.
- Parameters:
db (EmbeddingDatabase) – The embedding database containing results.
output_dir (Path) – Directory path for cache files.
config_name (str) – Name of the configuration for cache file prefix.
- get_aggregated_data()
Load aggregated data from cache if valid, otherwise aggregate from scratch.
Checks if the cache is newer than the database modification time. If valid, loads from cache; otherwise, aggregates fresh data.
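The freshness check described above can be sketched as a comparison of file modification times. This is a minimal illustration, not the module's actual code: the helper name load_or_aggregate, the JSON cache format, and the idea that the database is a single file on disk are all assumptions made for the example.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Any, Callable, Optional


def load_or_aggregate(
    cache_path: Path,
    db_path: Path,
    aggregate: Callable[[], Optional[dict[str, Any]]],
) -> Optional[dict[str, Any]]:
    """Return cached data when the cache is newer than the database file.

    Hypothetical helper: the real DataAggregator may store its cache in a
    different format and locate the database differently.
    """
    if cache_path.exists() and cache_path.stat().st_mtime > db_path.stat().st_mtime:
        # The cache was written after the last database change: reuse it.
        return json.loads(cache_path.read_text(encoding="utf-8"))

    # The cache is missing or stale: recompute and refresh it.
    data = aggregate()
    if data is not None:
        cache_path.write_text(json.dumps(data), encoding="utf-8")
    return data
```

Comparing modification times keeps the check cheap, since neither the cache nor the database has to be read to decide whether the cache is still valid.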
- Returns:
Dictionary containing aggregated data for reporting, or None if no processing results are available. Contains keys:
all_results: Raw results from database.
processed_data_for_interactive_page: Optimized web data.
all_models_metrics: Metrics organized by model.
model_embeddings_for_variance: Embeddings for analysis.
total_combinations: Count of model combinations.
- Return type:
Web Generator
Web page generation module for ForzaEmbed.
This module provides functions for generating interactive HTML pages for visualising embedding analysis results, including heatmaps and comparison charts.
Templates are maintained as separate files under src/reporting/templates/ for easier editing:
template.html: HTML structure with %%PLACEHOLDER%% markers
style.css: Professional report stylesheet (minified at build time)
main.js: Interactive report logic
worker.js: Web Worker for Base64/zlib decompression
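How the template markers and the compressed payload might fit together can be sketched as follows. The placeholder names CSS and DATA, and the helper names encode_payload, decode_payload, and fill_template, are hypothetical and chosen only for illustration. The sketch assumes the Python side produces the Base64/zlib data that worker.js later decompresses.

```python
from __future__ import annotations

import base64
import json
import zlib


def encode_payload(data: dict) -> str:
    """Compress JSON with zlib, then Base64-encode it so it can sit inside HTML."""
    raw = json.dumps(data, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


def decode_payload(payload: str) -> dict:
    """Inverse of encode_payload; mirrors what a decompression worker would do."""
    return json.loads(zlib.decompress(base64.b64decode(payload)).decode("utf-8"))


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace each %%NAME%% marker with its value."""
    for name, value in values.items():
        template = template.replace(f"%%{name}%%", value)
    return template


# Hypothetical placeholder names, for illustration only.
template = "<style>%%CSS%%</style><script>const DATA = '%%DATA%%';</script>"
html = fill_template(
    template,
    {"CSS": "body{margin:0}", "DATA": encode_payload({"scores": [0.1, 0.9]})},
)
```

Base64 output uses only letters, digits, "+", "/", and "=", so the encoded payload cannot be mistaken for a %%...%% marker during substitution.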
Example
Generate an interactive web page:
from src.reporting.web_generator import generate_main_page
generate_main_page(
processed_data, output_dir, total_combinations,
single_file=True, config_name="my_config"
)
- src.reporting.web_generator.safe_numpy_converter(obj)
Recursively convert NumPy types to native Python types for JSON serialisation.
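A recursive conversion of this kind can be sketched without importing NumPy by relying on the tolist() method that both NumPy arrays and NumPy scalars provide. The function name to_native is hypothetical, and the duck-typed check is an assumption; the real safe_numpy_converter may use explicit isinstance checks instead.

```python
import json
from typing import Any


def to_native(obj: Any) -> Any:
    """Recursively replace NumPy-like values with plain Python ones."""
    if isinstance(obj, dict):
        # JSON object keys must be strings, so normalise them here.
        return {str(key): to_native(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple, set)):
        return [to_native(item) for item in obj]
    if hasattr(obj, "tolist"):
        # NumPy arrays and NumPy scalars both expose tolist(), which
        # returns nested lists or a native Python scalar.
        return to_native(obj.tolist())
    return obj


# Plain containers pass through, with tuples turned into lists.
serialised = json.dumps(to_native({"shape": (2, 3)}))
```

With NumPy installed, passing a dictionary that holds arrays or NumPy scalars through to_native gives a structure that json.dumps accepts.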
- src.reporting.web_generator.generate_main_page(processed_data, output_dir, total_combinations, single_file=False, graph_paths=None, config_name='config', themes_config=None)
Generate the main interactive web page for heatmap visualisation.
Creates HTML files with embedded JavaScript for interactive exploration of embedding similarity results.
- Parameters:
processed_data (dict[str, Any]) – Dictionary containing processed analysis data.
output_dir (str) – Directory path for output HTML files.
total_combinations (int) – Total number of model combinations processed.
single_file (bool) – If True, creates a single index.html covering all files. If False, creates one HTML file per markdown file. Defaults to False.
graph_paths (dict[str, list[str]] | None) – Dictionary mapping file keys to lists of graph image paths.
config_name (str) – Name of the configuration for file prefixes.
themes_config (dict[str, Any] | None) – Theme configuration for tooltip display.
Markdown Filter
Markdown filtering module for ForzaEmbed.
This module provides the MarkdownFilter class, which generates filtered markdown files based on a similarity threshold, keeping only the chunks that score above it.
Example
Generate filtered markdown files:
from src.reporting.markdown_filter import MarkdownFilter
markdown_filter = MarkdownFilter(db, config, output_dir, "config_name")
markdown_filter.generate_filtered_markdowns()
- class src.reporting.markdown_filter.MarkdownFilter(db, config, output_dir, config_name)
Bases: object
Handle generation of filtered markdown files based on similarity threshold.
Creates filtered versions of input markdown files containing only the text chunks that exceed the similarity threshold for each model.
- db
The embedding database containing results.
- config
Configuration dictionary with filter settings.
- output_dir
Directory path for output files.
- config_name
Name of the configuration for file prefixes.
- similarity_threshold
Minimum similarity for including chunks.
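The threshold-based selection that MarkdownFilter performs can be sketched as below. The function names filter_chunks and render_markdown are hypothetical, as is the input layout of one similarity score per chunk. The strict comparison is also an assumption, since the documentation does not say whether a chunk exactly at the threshold is kept.

```python
from __future__ import annotations


def filter_chunks(
    chunks: list[str], similarities: list[float], threshold: float
) -> list[str]:
    """Keep only the chunks whose similarity exceeds the threshold."""
    if len(chunks) != len(similarities):
        raise ValueError("chunks and similarities must have the same length")
    # Assumption: strict comparison; the real class may use >= instead.
    return [chunk for chunk, score in zip(chunks, similarities) if score > threshold]


def render_markdown(chunks: list[str]) -> str:
    """Join the kept chunks into one markdown document."""
    return "\n\n".join(chunks) + "\n" if chunks else ""


kept = filter_chunks(["intro", "opening hours", "footer"], [0.2, 0.8, 0.5], 0.5)
filtered_markdown = render_markdown(kept)
```

Running the selection once per model yields one filtered document per model, matching the per-model behaviour described above.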