Video Deduplication
Bayanat integrates with Benetech's video deduplication tool (JusticeAI). Matches that the tool generates for video Bulletins already in Bayanat can be imported, and Bayanat then turns them into Bulletin-to-Bulletin relationships based on the distance between the videos.
Importing Matches
From the Bayanat root directory:
source .venv/bin/activate
Import the CSV file generated by the generate_matches.py script:
flask dedup-import /path/to/matches.csv
The command will prompt about overwriting existing matches.
Processing Matches
The deduplication tool provides a distance value (0 to 0.7) for each match between two videos.
Bayanat uses the following criteria to create relationships:
- Potentially Duplicate: distance below 0.3
- Potentially Related: distance between 0.3 and 0.5
These thresholds are based on SJAC's testing and may be adjusted.
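The criteria above can be sketched as a small Python function. This is only an illustration, not Bayanat's actual code: the names `classify_distance`, `DUPLICATE_THRESHOLD` and `RELATED_THRESHOLD` are hypothetical, and the handling of the boundary values (0.3 and 0.5) and of distances above 0.5 is an assumption, since the criteria do not spell it out.

```python
from typing import Optional

# Thresholds taken from the criteria above; the constant names are hypothetical.
DUPLICATE_THRESHOLD = 0.3
RELATED_THRESHOLD = 0.5


def classify_distance(distance: float) -> Optional[str]:
    """Map a match distance to a relationship label, or None for no relationship."""
    if distance < 0:
        raise ValueError("distance must be non-negative")
    if distance < DUPLICATE_THRESHOLD:
        return "Potentially Duplicate"
    # Assumption: both boundary values, 0.3 and 0.5, count as "Potentially Related".
    if distance <= RELATED_THRESHOLD:
        return "Potentially Related"
    # Assumption: a distance above 0.5 does not produce a relationship.
    return None


for d in (0.12, 0.41, 0.65):
    print(d, classify_distance(d))
```

Keeping the thresholds as named constants makes it easy to experiment with different cut-off values on your own data.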
Bayanat looks up both Bulletins in each match. If no relationship exists between them yet, it creates one based on the criteria above.
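The processing step can be pictured with the following in-memory sketch. It is not Bayanat's implementation: `process_matches`, `example_label`, the tuple layout of a match and the use of integer Bulletin IDs are all hypothetical, and skipping matches whose Bulletins cannot be found is an assumption. The point is only to show the order of checks: look up both Bulletins, leave existing relationships untouched, then apply the distance criteria.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Optional, Set, Tuple

Pair = FrozenSet[int]
Match = Tuple[int, int, float]  # (bulletin id, bulletin id, distance); hypothetical layout


def process_matches(
    matches: Iterable[Match],
    known_bulletins: Set[int],
    existing: Set[Pair],
    label_for: Callable[[float], Optional[str]],
) -> Dict[Pair, str]:
    """Return the new relationships that the given matches would create."""
    created: Dict[Pair, str] = {}
    for id_a, id_b, distance in matches:
        # Assumption: skip self-matches and matches whose Bulletins are not found.
        if id_a == id_b or not {id_a, id_b} <= known_bulletins:
            continue
        pair = frozenset((id_a, id_b))  # order-independent key for the pair
        if pair in existing or pair in created:
            continue  # a relationship already exists, so leave it untouched
        label = label_for(distance)
        if label is not None:
            created[pair] = label
    return created


def example_label(distance: float) -> Optional[str]:
    # Thresholds from the criteria above; the boundary handling is an assumption.
    if distance < 0.3:
        return "Potentially Duplicate"
    if distance <= 0.5:
        return "Potentially Related"
    return None


new = process_matches(
    matches=[(1, 2, 0.12), (2, 3, 0.41), (1, 3, 0.20), (3, 4, 0.65), (4, 9, 0.10)],
    known_bulletins={1, 2, 3, 4},
    existing={frozenset((1, 3))},
    label_for=example_label,
)
for pair, label in new.items():
    print(sorted(pair), label)
```

In this example only the pairs (1, 2) and (2, 3) yield new relationships: (1, 3) is already related, (3, 4) is too distant, and Bulletin 9 is not known.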