Computer Vision News - February 2022
Medical Imaging Tools

What happens when there are duplicates in the train and valid sets? To see what happens, I have another dataset that has duplicate images:

    duplicate_ds = get_dicom_files(f'{pneu}/sm')
    check_duplicate(duplicate_ds, valid_pct=0.2, seed=7)

    Train: 211
    Original Validation: 52
    Updated Validation: 48

In this case the original split had 211 images in the train set and 52 images in the validation set. check_duplicate was able to find 4 duplicates in the validation set and remove
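To make the idea concrete, here is a minimal sketch of how such a check could work. It is not the implementation of check_duplicate, whose internals are not shown here. It assumes a duplicate is any validation item that shares a key (for example a patient identifier) with an item in the train set. The names Item, patient_id and split_and_dedupe are hypothetical and exist only for this illustration.

```python
import random
from collections import namedtuple

# Hypothetical record type: a file path plus the key used to detect duplicates.
Item = namedtuple("Item", ["path", "patient_id"])


def split_and_dedupe(items, valid_pct=0.2, seed=7, key=lambda it: it.patient_id):
    """Randomly split items, then drop validation items whose key is also in train.

    Assumption: two items are duplicates when their keys are equal.
    Returns (train, original_valid, updated_valid).
    """
    rng = random.Random(seed)      # seeded so the split is reproducible
    shuffled = list(items)         # copy, so the caller's list is not modified
    rng.shuffle(shuffled)

    cut = int(len(shuffled) * valid_pct)
    valid, train = shuffled[:cut], shuffled[cut:]

    # Keep only validation items whose key never appears in the train set.
    train_keys = {key(it) for it in train}
    updated_valid = [it for it in valid if key(it) not in train_keys]
    return train, valid, updated_valid


if __name__ == "__main__":
    # Toy data: 20 files, where every pair of files shares one patient id.
    items = [Item(f"img_{i}.dcm", f"patient_{i // 2}") for i in range(20)]
    train, valid, updated_valid = split_and_dedupe(items, valid_pct=0.2, seed=7)
    print("Train:", len(train))
    print("Original Validation:", len(valid))
    print("Updated Validation:", len(updated_valid))
```

Removing the overlapping items from the validation set, rather than from the train set, keeps the training data intact and mirrors the output above, where only the validation count changes.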